Path: utzoo!attcan!uunet!cs.utexas.edu!usc!orion.oac.uci.edu!uci-ics!rfg
From: rfg@ics.uci.edu (Ron Guilmette)
Newsgroups: comp.sys.m88k
Subject: Re: Register Allocation (was Re: Info about 88open & standards)
Message-ID: <1989Nov20.205338.11760@paris.ics.uci.edu>
Date: 21 Nov 89 04:53:38 GMT
References: <1948@psueea.UUCP> <1989Nov14.175806.23483@paris.ics.uci.edu> <5063@tekcrl.LABS.TEK.COM> <2631@yogi.oakhill.UUCP> <1989Nov16.212149.9770@paris.ics.uci.edu> <2647@bushwood.oakhill.UUCP>
Reply-To: Ron Guilmette
Organization: University of California, Irvine - Dept of ICS
Lines: 213

In article <2647@bushwood.oakhill.UUCP> oakhill!bushwood!phillip@cs.utexas.edu (Mike Phillip) writes:
>
>First of all, the parameter passing registers (r2-r9) are really
>irrelevant to the caller vs. callee discussion, since they are
>effectively treated as "caller save" in ALL cases.

Right. They're just more "caller save" registers.

>... The main point of debate seems to be the
>rationale behind the chosen division between caller and callee registers.

I didn't realize that I had started a "debate"! I thought I was just
giving voice to self-evident truths! :-) (Note smiley face)

>>What if there are *never* any dead registers?

I *still* haven't seen a good answer to *this* question.

>>... it may be possible to use *all* available registers
>>for at least *some* productive purpose at *all* points throughout a program.

I wish I had said that! (Oops... I did!) :-)

> This is where things get tricky... The key point to recognize is that
>regardless of compiler technology, there will likely always need to be a
>distinction between caller and callee saved registers.

Yes. The distinction is valuable. It is needed so that I can tell you
what type of saving convention is harmful. "Caller saving considered
harmful!"

> If all registers were designated as being in "caller save" mode, code...
>... some registers
>may be "dead" at the point of the call).

Perhaps you didn't read the whole of my previous posting. In any case,
the important part (where I assert that it's possible to have all
registers live at all points) is given above.

>In cases where the callee needs only a few registers, [a caller saves]
> convention would likely result in unnecessary load and store
>instructions...

Right. Now if you can just see your way clear to generalize this Obvious
Truth from one particular register to the set of *all* registers, then my
point will have been proven. All I'm trying to say is that what is bad
for one register (i.e. being saved/restored unnecessarily) is bad for
*all* registers.

> At the other extreme, if all registers were used in "callee save"
>mode ... the callee
>would generate load and store instructions for every register that it
>uses, regardless of whether such overhead is warranted by the "live" use
>of the same register in the caller.

Right, but you are still begging the question and avoiding the issue.
What if the saves & restores are *always* warranted by virtue of the fact
that *all* registers contain live values at *all* points? In that case,
there would be no unnecessary loads or stores in a strictly callee-saves
convention.

What I'm having trouble understanding is why various people (Mike P.
included) seem to be unwilling to accept that this convention (used for
decades on CISC machines) may actually be simply *the* best possible
convention, regardless of whether or not you seem to have lots of
registers to waste or whether or not you have a reduced instruction set.

>Thus, some division is needed
>between caller and callee saved registers.

What you mean to say is that you need to have both kinds. You certainly
have not justified that conclusion with any evidence that addresses my
"all live, all the time" argument. I'm willing to be convinced, but I
have not been yet.

>Why did OCS choose 12/12 ?

My question exactly. I believe that 0/31 would have been a better choice.
(Note that I classify r0 as a callee-saves register because the caller
may assume that it is preserved across calls without doing anything ;-)
Also, I give the ratio as 0/31 because I assume C language where a result
comes back (typically) in one register (by convention r2).

>I wasn't involved in the decision making process, but the issue only
>becomes significant when the following criteria are met:
>  1) The callee requires more than 12 registers to perform adequate
>     optimizations.
>  2) The additional "callee save" registers used actually did NOT need
>     to be saved and restored because their live range did not span the
>     subroutine call in the caller.

I don't believe that #1 even enters into the issue at all. Regarding #2,
you are really missing my point. What I was trying (in my own futile way)
to say was that the issue is *always* significant because callers can
always ensure that virtually *all* of their registers contain "live"
values at all call points. Thus, any registers clobbered by the callee
would necessarily have to be saved and restored.

>How often does this occur?

Empirically speaking, beats me. Who cares? The question is "Has the OCS
effectively (and permanently) crippled an otherwise impressive machine,
by ignoring the possibility that newer and better compilers can keep more
useful live values in more registers over more calls?"

>(I don't have any such data at my disposal at the moment).

Yep.

>But your above
>argument that good overall optimization in a compiler will result in most
>registers holding live values across calls actually MINIMIZES the effect
>of having "too many" caller or callee saved registers.

Wrong. It minimizes the negative implications of the possibility that the
callee might save/restore too much stuff in a "callee-saves" convention
(because it simply never will).
On the other hand, having *lots* of live values (because you have an
intelligent optimizing compiler) definitely will show a "caller-saves"
convention to be what it is, i.e. a rotten way of doing things, and
strictly counterproductive.

>In fact, if the
>register being used in a callee needs to be saved SOMEWHERE (i.e. the
>caller has a live value in it across the call), it can be argued that
>"callee" saved registers are preferable, since the callee can insert the
>saves and restores only in those flow paths where the register contents
>are clobbered. In such cases, "caller save" conventions can result in
>unnecessary overhead.

Good. You have seen the light. Now just apply this Obvious Truth
iteratively to the set of *all* registers.

>
>Now onward to inter-procedural analysis...
>
> (A good reference for this topic is "Minimizing Register Usage
> Penalty at Procedure Calls" by Fred Chow from SIGPLAN Notices, July 88)
>
> As described in the above paper, "callee save" registers can
>effectively be treated as "caller save" registers by performing a
>depth-first traversal of the program call graph. (i.e. by the time the
>caller is analyzed, it has all relevant register usage info about the
>callee). The callee can then defer saving and restoring to the caller,
>making the register appear as though it was operating in "caller save"
>mode. There are limitations to this approach, however.

You can say that again!

>These
>limitations occur when the compiler does not have complete register
>usage information for a particular subroutine (libraries are a commonly
>cited example).

Or any sort of external routine! In other words this serious limitation
will apply to *most* calls in any large program. A callee-saves
convention has no such limitations or problems of this sort.

>In such cases, even "globally-optimizing, inter-procedural analyzing"
>compilers need to resort to a default convention, and the trade-offs of
>caller vs. callee once again become relevant.

Right.
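
The depth-first call-graph idea quoted above, and its limitation for
external routines, can be illustrated roughly as follows. This is a
minimal Python sketch of the general idea only, not the algorithm from
the cited paper. The procedure names, the `local_use` table, and the
choice of r2-r13 as the default clobber set are all assumptions made for
illustration, and the call graph is assumed to be acyclic.

```python
# Illustrative default: registers an unknown (external) callee may clobber.
# The range r2-r13 is an assumption made for this sketch.
DEFAULT_CLOBBER = frozenset(f"r{i}" for i in range(2, 14))


def clobber_sets(call_graph, local_use, default_clobber=DEFAULT_CLOBBER):
    """Compute, bottom-up, the registers each known procedure may clobber.

    call_graph: procedure name -> list of callee names.
    local_use:  procedure name -> registers written by that procedure's body.
    A callee that is not a key of call_graph is treated as external.
    """
    result = {}
    in_progress = set()

    def visit(proc):
        if proc not in call_graph:
            # External routine: no usage info, fall back to the default.
            return default_clobber
        if proc in result:
            return result[proc]
        if proc in in_progress:
            raise ValueError(f"recursion through {proc!r} is not handled here")
        in_progress.add(proc)
        used = set(local_use.get(proc, ()))
        for callee in call_graph[proc]:
            # Callees are finished before the caller (depth-first order).
            used |= visit(callee)
        in_progress.discard(proc)
        result[proc] = frozenset(used)
        return result[proc]

    for proc in call_graph:
        visit(proc)
    return result


def must_save_around_call(live_in_caller, callee, sets,
                          default_clobber=DEFAULT_CLOBBER):
    """Registers the caller has to save: live in the caller AND clobbered."""
    return set(live_in_caller) & sets.get(callee, default_clobber)


# Hypothetical program: main calls f and an external printf; f calls g.
call_graph = {"main": ["f", "printf"], "f": ["g"], "g": []}
local_use = {"main": {"r14"}, "f": {"r10"}, "g": {"r11", "r12"}}
sets = clobber_sets(call_graph, local_use)
print(sorted(must_save_around_call({"r10", "r14", "r20"}, "f", sets)))
```

In this sketch the call to the external `printf` forces `main` back onto
the default set, which is the limitation discussed above.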

>Thus, OCS does not prevent
>optimizing compilers from taking advantage of inter-procedural register
>allocation, but provides the "default" for those cases when such
>optimizations cannot be made.

Unfortunately compiler writers only want to support one "convention". As
it is, I fear that many such people are effectively being forced to
blindly conform to this inadequate and mediocre default convention for
*all* calls so that they may be assured of being able to have the code
they generated *interoperate* with "standard" libraries and/or with code
produced by other compilers.

>Yes, the inclusion of both caller and callee save registers is a
>compromise... always has been, and probably always will...

If it seemed as though it was a political compromise in which everybody
got a part of what they wanted, then that would be one thing. But this is
obviously *not* a political compromise. It is a technical compromise
between excellent performance and lousy performance. Nobody won and
everybody lost.

>I disagree, however, that the compromise is in place to mask compiler
>deficiencies

If what you are saying is true (i.e. if this is strictly a reasonable
technical compromise) then please tell me if 88open collected any
empirical evidence about *large* programs at various possible "compromise
points" along the scale from 0/12 to 12/0. If they did not, perhaps they
would like to arrange to do so before the final draft of the OCS is cast
into stone.

I know that the GNU C compiler allows the user to vary the number of
registers in each category, so the tests (benchmarks of the calling
conventions we could call them) should be quite simple to perform and
would yield absolutely clear cut evidence of major significance to all
88open members.

If I'm wrong that "callee-saves" always produces better performance (with
smaller executables to boot) then such tests could also have another
significant benefit. They could shut me up, and I'm sure everybody would
be in favor of that!
:-)

>Geez, I figured most of the register allocation debates would involve
>r26-r29, which are the so-called "linker registers". ;^)

We'll get to that. Later, later...

// rfg
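
The disagreement in this thread can be made concrete with a toy cost
model. The Python sketch below counts store+load pairs for a single call
under a pure caller-saves and a pure callee-saves convention. It is an
illustration under stated assumptions, not a measurement: the register
names, the example live and clobber sets, and the premise that a caller
knows nothing about what its callee clobbers are all assumptions, and the
model ignores save placement along flow paths, scheduling, and memory
latency.

```python
# r1..r31 as an illustrative register file (r0 is left out of the model).
ALL_REGS = frozenset(f"r{i}" for i in range(1, 32))


def save_restore_ops(convention, live_in_caller, clobbered_by_callee):
    """Return (total, wasted) store+load counts for one call.

    A save is "wasted" when the register was not both live in the caller
    and clobbered by the callee.
    """
    live = set(live_in_caller)
    clobbered = set(clobbered_by_callee)
    if convention == "caller":
        # Caller cannot see into the callee, so it saves everything live.
        saved = live
    elif convention == "callee":
        # Callee cannot see into the caller, so it saves all it clobbers.
        saved = clobbered
    else:
        raise ValueError(f"unknown convention: {convention!r}")
    needed = live & clobbered
    return 2 * len(saved), 2 * len(saved - needed)


clobbered = {"r10", "r11", "r12"}  # hypothetical small callee

# "All live, all the time": callee-saves wastes nothing.
print(save_restore_ops("caller", ALL_REGS, clobbered))
print(save_restore_ops("callee", ALL_REGS, clobbered))

# Only one register live across the call: callee-saves now wastes work.
print(save_restore_ops("caller", {"r10"}, clobbered))
print(save_restore_ops("callee", {"r10"}, clobbered))
```

Which of the two cases dominates in large real programs is exactly the
empirical question the proposed benchmarks would have to settle.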