Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!rutgers!sri-spam!ames!ucbcad!ucbvax!decvax!tektronix!uw-beaver!cornell!rochester!pt.cs.cmu.edu!sei.cmu.edu!firth
From: firth@sei.cmu.edu.UUCP
Newsgroups: comp.arch
Subject: Re: subroutine frequency
Message-ID: <540@aw.sei.cmu.edu.sei.cmu.edu>
Date: Fri, 30-Jan-87 08:32:29 EST
Article-I.D.: aw.540
Posted: Fri Jan 30 08:32:29 1987
Date-Received: Sat, 31-Jan-87 19:39:15 EST
References: <1881@homxc.UUCP> <898@moscom.UUCP>
Sender: netnews@sei.cmu.edu
Reply-To: firth@bd.sei.cmu.edu.UUCP (PUT YOUR NAME HERE)
Organization: Carnegie-Mellon University, SEI, Pgh, Pa
Lines: 46
Keywords: register stack frame variable

In article <898@moscom.UUCP> jgp@moscom.UUCP (Jim Prescott) writes:
>It depends on the architecture and the compiler, the three easy ways
>to do it are:
>	a) save all registers
>	b) have the caller save only the registers it is using
>	c) have the callee save only the registers it will use
>pdp-11's use "a" since they only have 3 register variables anyway.  Most
>68k compilers use "c" since you get about 12 register variables.  I don't
>know of anyone who uses "b" but it should be about as efficient as "c".
>
>The method used has a large effect on whether setjmp/longjmp can put the
>correct values back into register variables (SYSVID says they may be
>unpredictable :-(.

The code generators I wrote for the PDP-11 and VAX-11 use method (b).  The
main reason for this was precisely the longjump problem: if local frames
store non-local state, then that state can be restored only by the very
slow process of unwinding the stack.  If on the other hand each frame
keeps its own state, a longjump is just (reset stack pointer; jump), or
at worst, if you keep a closed stack, (reset frame pointer; jump;
destination resets stack front pointer).  The alternative of having the
longjump NOT restore state that happened to be in registers was not
permitted by the languages in question.
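To make the longjump point concrete, here is a toy model, not a
description of any real code generator.  Frames are plain Python dicts,
there is a single hypothetical register "r2", and every name below
(build_stack, longjump_callee_save, longjump_caller_save, demo) is
invented for illustration.  It counts only how many frames a longjump
has to visit before the destination sees its own register value again.

```python
# Toy model: frame i keeps the value 10*(i+1) in the register "r2" while it runs.

def build_stack(depth, scheme):
    """Build `depth` nested frames under one of the two saving schemes."""
    stack = []
    for i in range(depth):
        if scheme == "callee":
            # The callee saved its CALLER's r2 on entry, so frame i holds
            # non-local state (frame 0 has no caller and saves nothing).
            saved = {"r2": 10 * i} if i > 0 else {}
            stack.append({"saved_on_entry": saved})
        else:
            # The caller saved its OWN r2 before calling onward; the top
            # frame has not made a call, so it has saved nothing yet.
            saved = {"r2": 10 * (i + 1)} if i < depth - 1 else {}
            stack.append({"saved_before_call": saved})
    return stack

def longjump_callee_save(stack, registers, target_depth):
    """Unwind one frame at a time, restoring what each callee saved."""
    frames_visited = 0
    while len(stack) > target_depth:
        frame = stack.pop()
        registers.update(frame["saved_on_entry"])
        frames_visited += 1
    return frames_visited

def longjump_caller_save(stack, registers, target_depth):
    """Cut the stack back in one step; the destination reloads from its own frame."""
    del stack[target_depth:]                           # reset stack pointer
    registers.update(stack[-1]["saved_before_call"])   # same reload as after any call
    return 1

def demo(depth=5):
    """Jump from the innermost frame back to the outermost one under both schemes."""
    results = {}
    for scheme, jump in (("callee", longjump_callee_save),
                         ("caller", longjump_caller_save)):
        stack = build_stack(depth, scheme)
        registers = {"r2": 10 * depth}   # value belonging to the innermost frame
        visited = jump(stack, registers, target_depth=1)
        results[scheme] = (visited, registers["r2"])
    return results
```

In this sketch both schemes end with r2 holding the outermost frame's
value, but the callee-save jump has to touch every intervening frame,
while the caller-save jump does a fixed amount of work regardless of
depth.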
Well, I benchmarked this technique against the alternative of having the
callee save, and it came out better on both machines.  Surprising for the
VAX, since that has hardware to save and restore registers, but in fact
the special instructions are only marginally faster than doing it by hand.
(Of course, one does not use CALLS under any circumstances; using JSB,
controlling the local-variable stack yourself, and passing parameters in
registers buys you a factor of three in the procedure call overhead.)

The main reasons for the difference are interesting:

(a) fewer registers are involved.  This is because the callee must save
    every register it uses ANYWHERE in its body, whereas the caller need
    save only registers CURRENTLY LIVE.

(b) fewer memory accesses.  Callee must save and restore always; caller
    can restore the register from a declared variable some (~1/3) of
    the time, and so need not save it.

For other methodological reasons too boring to post here, I am a firm
believer in the "caller save always; callee save never" technique.

Robert Firth
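
The arithmetic behind reasons (a) and (b) can be sketched as below.
This is only an illustration under stated assumptions: the register
names and set sizes are made up, the two function names are
hypothetical, and the cost model simply charges one memory access per
save and one per restore.  It does not reproduce the benchmark.

```python
def callee_save_accesses(used_anywhere):
    """Callee saves and restores every register it uses anywhere in its body."""
    return 2 * len(used_anywhere)

def caller_save_accesses(live_at_call, backed_by_variable):
    """Caller handles only registers live across this call.

    A live register that mirrors a declared variable already in memory
    needs no save, only a reload afterwards (1 access); any other live
    register costs a save plus a reload (2 accesses).
    """
    accesses = 0
    for reg in live_at_call:
        accesses += 1 if reg in backed_by_variable else 2
    return accesses

# Hypothetical call: the callee touches four registers somewhere in its
# body, while the caller has three live at the call site, one of which
# can be reloaded from its declared variable.
callee_cost = callee_save_accesses({"r2", "r3", "r4", "r5"})
caller_cost = caller_save_accesses({"r2", "r3", "r4"}, {"r4"})
```

With these made-up sets the callee-save cost is 8 accesses and the
caller-save cost is 5; the real ratio depends entirely on the program
and the register allocator.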