Path: utzoo!attcan!uunet!samsung!uakari.primate.wisc.edu!uflorida!novavax!hcx1!tom
From: tom@ssd.harris.com (Tom Horsley)
Newsgroups: comp.sys.m88k
Subject: Re: Register Allocation (was Re: Info about 88open & standards)
Message-ID:
Date: 22 Nov 89 12:37:02 GMT
References: <1948@psueea.UUCP> <1989Nov14.175806.23483@paris.ics.uci.edu> <5063@tekcrl.LABS.TEK.COM> <2631@yogi.oakhill.UUCP> <1989Nov16.212149.9770@paris.ics.uci.edu> <2647@bushwood.oakhill.UUCP> <1989Nov20.205338.11760@pari
Sender: news@hcx1.UUCP
Organization: Harris Computer Systems Division
Lines: 55
In-reply-to: rfg@ics.uci.edu's message of 21 Nov 89 04:53:38 GMT

I can't resist entering this discussion; endless and pointless arguments have always appealed to me :-).

I can point out one obvious flaw in the idea that the called routine should do the register saves. We have several benchmarks as well as several real programs (as opposed to the typical benchmark :-) in which it is possible to examine the code and see that the current conventions produce fantastic code.

This occurs (quite frequently, I might add) when the outer routine contains loops, and the loops contain subroutine calls. Very often (because 12 registers is really an awful lot of registers) the leaf routine does not need to save any registers at all. (For instance, this is true of quite a lot of the low-level str and mem routines in the C library.) This means that the subroutine (which, if you will recall, is called in a loop) runs much faster than it would if it were saving every register it used. Quite often, the outer routine is also not complex enough to require the use of all the registers (r14-r26 is still a lot more registers than I typically have available on a CISC machine).

All this boils down to the fact that the current convention may not be perfect, but it is, in practice, pretty nice.
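To put rough numbers on the loop-calling-a-leaf case, here is a toy cost model. It is only a sketch under stated assumptions: every register save or restore costs one memory operation, the leaf has 12 scratch registers it may clobber freely, and the function names and figures are made up for illustration rather than taken from any measured program.

```python
# Toy cost model: count register save/restore memory operations for an
# outer routine that calls a leaf routine in a loop.
# All names and numbers are illustrative assumptions, not measurements.

SCRATCH_REGS = 12  # registers a routine may clobber without saving (assumed)


def callee_saves_all(leaf_regs_used, outer_regs_used, calls):
    """Every routine saves and restores each register it uses.

    The outer routine pays once; the leaf pays on every call.
    """
    return 2 * outer_regs_used + 2 * leaf_regs_used * calls


def split_convention(leaf_regs_used, outer_regs_used, calls):
    """Scratch registers are free for the leaf; only overflow is saved.

    The outer routine keeps values that live across calls in preserved
    registers, which it saves and restores once.
    """
    leaf_overflow = max(0, leaf_regs_used - SCRATCH_REGS)
    return 2 * outer_regs_used + 2 * leaf_overflow * calls


if __name__ == "__main__":
    # A leaf using 4 registers, called 1000 times from a routine using 6.
    print(callee_saves_all(4, 6, 1000))   # 8012 memory operations
    print(split_convention(4, 6, 1000))   # 12 memory operations
```

The gap grows linearly with the trip count, which is why the effect shows up most clearly in small library routines called from inner loops. Once the leaf needs more than the scratch set, the two models converge.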
This is real code I am talking about, which we have really spent some time examining for real quality issues, not some hypothetical situation in which some magic compiler has figured out how to keep something live in every register all the time. We have a good optimizing compiler with a good register allocator that can do neat tricks like making separate lifetime regions for globals around loops, so they can live in memory in regions where there are aliased references and in registers in regions where there are not. It turns out that, more often than you might expect, there are already enough registers. As a practical matter, having more registers would help sometimes, but not as often as you might think.

This is all still operating without interprocedural analysis. In any system where each routine is compiled in a vacuum, some calling convention is necessary, and I don't have any difficulty making my highly optimizing compiler with its fancy register allocator deal with any convention. If you go into the domain of a global program database with interprocedural register allocation, you no longer need to be bound by OCS rules; you can make any rules you want, because you can define your own interface for each routine to minimize global cost (don't ask me how to actually do this, mind you :-).

Also, doing interprocedural register allocation at link time is not necessarily ruled out just because you have to link in external libraries. The 88k architecture is really quite simple to disassemble and analyze. It is not totally beyond the realm of possibility that the linker could even modify register usage for things like libc.a, even without special support from a global program database. (Essentially, there would be a database; it would just be in a format that is kind of hard to interpret.)
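One small ingredient that any per-routine interface would plausibly need is knowing which registers a call can overwrite, including through the routines it calls in turn. The sketch below is hypothetical and is not how any particular compiler or linker does it: the routine names, register sets, and the `transitive_clobbers` helper are invented for illustration. It propagates clobber sets over a call graph until nothing changes.

```python
# Hypothetical sketch: compute, for each routine, the set of registers
# that a call to it may overwrite, directly or through its callees.
# Routine names and register sets below are invented for illustration.


def transitive_clobbers(direct, calls):
    """direct: routine -> set of registers the routine writes itself.
    calls:  routine -> set of routines it calls (all must appear in direct).
    Returns routine -> set of registers a call to it may overwrite.
    """
    clobbers = {name: set(regs) for name, regs in direct.items()}
    changed = True
    while changed:  # iterate to a fixed point so recursion is handled
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                before = len(clobbers[caller])
                clobbers[caller] |= clobbers[callee]
                if len(clobbers[caller]) != before:
                    changed = True
    return clobbers


if __name__ == "__main__":
    direct = {
        "main": {"r14", "r15"},
        "process": {"r2", "r3", "r10"},
        "strlen": {"r2", "r3"},
    }
    calls = {"main": {"process"}, "process": {"strlen"}, "strlen": set()}
    for name, regs in sorted(transitive_clobbers(direct, calls).items()):
        print(name, sorted(regs))
```

With this information a caller only has to protect values held in registers the callee can actually touch, rather than everything a fixed convention marks as volatile. Choosing the register assignment that minimizes global cost is the hard part and is not attempted here.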
--
=====================================================================
domain: tahorsley@ssd.csd.harris.com  USMail: Tom Horsley
  uucp: ...!novavax!hcx1!tahorsley            511 Kingbird Circle
    or  ...!uunet!hcx1!tahorsley              Delray Beach, FL 33444
======================== Aging: Just say no! ========================