Xref: utzoo comp.arch:21484 comp.lang.misc:6883
Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!spool.mu.edu!news.nd.edu!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
From: hrubin@pop.stat.purdue.edu (Herman Rubin)
Newsgroups: comp.arch,comp.lang.misc
Subject: Re: Unusual instructions and constructions
Summary: What is done by inlining?
Message-ID: <7974@mentor.cc.purdue.edu>
Date: 15 Mar 91 17:47:36 GMT
References: <7571@mentor.cc.purdue.edu> <1991Mar14.013109.16636@kithrup.COM> <1991Mar14.195853.27398@kithrup.COM>
Sender: news@mentor.cc.purdue.edu
Followup-To: comp.arch
Lines: 67

In article <1991Mar14.195853.27398@kithrup.COM>, sef@kithrup.COM (Sean Eric Fagan) writes:
> In article <7850@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:

                        ......................

> >This example has far too many loads and stores.
>
> 9 memory references.  4 necessary for the calling sequence gcc conforms to.
> Three necessary to call another function, since the 'bcs' does not specify
> calling routines with values in registers.  Two more because arguments
> passed in are not in registers.  Leaving a total of 0 unnecessary loads and
> stores.  Could this be improved?  Certainly.  But not by much.

> >Possibly this MIGHT not be
> >too important for a division, but how about something like frexp?
>
> I think it was frexp() that I wrote for berkeley using gcc with inline
> assembly.  Uhm... I think it had 7 loads and stores, all but two or three of
> which would disappear if the function got inlined and optimized.
>
> >The
> >operations may be register-register, in which case all these loads and
> >stores are inappropriate.
>
> Herman:  where are you supposed to get the values from?  Magic?
>

Computing q,r = a/b should not even consider a subroutine call.  The
arguments, or at least most of them, are likely to be the results of
previous operations, and hence already in registers.
The results are likely to be used in proximal instructions, and hence kept
in registers rather than being stored.  This IS what decent compilers do
for the "standard" operations of + - * / ^ | &.  Other operations should
be treated in the same way, and not as subroutine calls.

> >Also, something this simple should be inlined;
> >if a subroutine call, there is the additional save/restore overhead which
> >has to be done somewhere.
>
> Jesus.  Guess what, herman:  the routine *was* inlined.  Take a look at the
> original source code again.

It is inlined, but it is still in the nature of a subroutine call.  These
"unusual" constructs should be treated as having general arguments, usually
not in specified locations.  The example given loaded the arguments and
stored the results, even if that were inlined.  Somewhat better would be to
have the arguments in registers specific to the inlining procedure, and the
results in other specified registers.  This is not what a decent compiler
now does for the operations it understands.  The expansion should allow
adding to THAT set of operations.

To summarize, what should be provided is to allow the compiler to accept
the idiom producer's insight into the various ways the job can be done
using the machine instructions or previous idioms, and optimize using this
information.  As I understand an inlined subroutine call, it could not
merely issue the instruction

        idivl a,b,q,r

or for some machines something similar to

        idivl a,b,q
        movl q',r

where q' is the register adjacent to q, assuming that things were in the
appropriate registers, and only load/store as needed.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)
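
A rough sketch of the q,r = a/b case in C, added for illustration and not
taken from the thread.  The names quot_rem and divide_qr are hypothetical.
It assumes a compiler that treats / and % as built-in operations; such a
compiler is free to keep the operands and both results in registers, and on
hardware whose divide instruction yields quotient and remainder together it
may issue that instruction only once.  Whether a particular compiler does so
is not guaranteed.

```c
#include <assert.h>

/* Hypothetical helper type: both results of one integer division. */
struct quot_rem {
    int q;  /* quotient, truncated toward zero (guaranteed since C99) */
    int r;  /* remainder, so that a == q * b + r */
};

/* Hypothetical name.  Only the built-in operators are used, so nothing
   here forces a load or store: the compiler may keep a, b, q and r in
   whatever registers it has chosen for the surrounding code. */
static inline struct quot_rem divide_qr(int a, int b)
{
    struct quot_rem out;

    assert(b != 0);   /* division by zero is undefined behaviour */
    out.q = a / b;
    out.r = a % b;
    return out;
}
```

For example, divide_qr(17, 5) yields q = 3 and r = 2.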