Path: utzoo!utgpu!attcan!uunet!convex!mozart!csmith
From: csmith@mozart.uucp (Chris Smith)
Newsgroups: comp.arch
Subject: Re: register save/restore
Message-ID: <699@convex.UUCP>
Date: 7 Nov 88 09:22:04 GMT
References: <3300037@m.cs.uiuc.edu> <5938@killer.DALLAS.TX.US> <7580@aw.sei.cmu.edu>
Sender: news@convex.UUCP
Lines: 54
In-reply-to: firth@sei.cmu.edu's message of 2 Nov 88 18:14:29 GMT

In article <7580@aw.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:
> Which is better - caller saves or callee saves?

Convex computers use caller saves; here are a few more observations to
toss in.

> B. Are the strategies equally sound?

> The point I consider most important, is that there is a definite
> semantic asymmetry between the two strategies.  If the caller saves,
> then the caller is saving, locally, his own local state.

It's worth noting that this also gives the caller a chance to do a free
"context switch" of register contents -- a burst of register-memory
traffic is inevitable no matter who does the save, but putting it in
the caller allows him to capitalize on the opportunity to load up a
different -- more useful -- set of values after the call.

> Rough guesses I have accumulated over time are
>
> * at any call point, caller is using ~ 2/3 of the registers it will
>   use at all (though this is partly due to defects in register
>   allocation strategies)
>
> * on average, a procedure call is (almost) immediately followed by
>   another about 2/3 of the time.  This implies that if the caller
>   saves, it will have to save ~ 45 times for every 100 calls.
>
> These two factors together imply that the cost of a caller-saves protocol
> is about 1/3 that of a callee-saves protocol.  (Do you believe that?)

Dynamically, our registers tend to be fuller than that -- they fill up
at the drop of a hat anyway, but if they don't, loop unrolling sees to
it that they do.  But this one:

> * the register may be slaving a known value.  The caller then need not
>   save at all, merely restore.  I find this is true of at least 20% of
>   live registers.

operates very powerfully on a register machine, where loads are
required for every memory operand.

All in all, on this machine and on *one* benchmark, the Fortran
validation tests, it looks like caller-saves is just under half the
cost of callee-saves.  (Counting only the register saves and
restores.)

One other point:  debuggers are up to hunting through saved frames to
find a variable allocated to R4, but when the variables flit around as
they are prone to do when the domain of register allocation is the
intervals between calls, it puts quite a strain on the debugger tables.
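
For anyone who wants to play with the accounting, here is a toy sketch in
Python of how the two conventions might be compared at a single call site.
Everything in it is an assumption made for illustration: the CallSite
fields, the register numbers, and the rule that every save or restore
costs one memory operation.  It is not a model of the Convex compiler and
does not reproduce the benchmark figure quoted above.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    """Hypothetical description of one call site (illustrative only)."""
    live: set          # caller registers holding values needed after the call
    clean: set         # live registers whose value already sits in memory
    callee_uses: set   # registers the callee writes


def caller_saves_cost(site: CallSite) -> int:
    # Caller saves only live registers that are not already backed by
    # memory, then restores every live register after the call.
    saves = len(site.live - site.clean)
    restores = len(site.live)
    return saves + restores


def callee_saves_cost(site: CallSite) -> int:
    # Callee cannot see what the caller needs, so it saves and restores
    # every register it writes, live in the caller or not.
    return 2 * len(site.callee_uses)


# Made-up example: 4 live registers, 1 of them a clean copy of memory,
# and a callee that writes 6 registers.
site = CallSite(live={1, 2, 3, 4}, clean={1}, callee_uses={1, 2, 3, 4, 5, 6})
print(caller_saves_cost(site), callee_saves_cost(site))  # prints "7 12"
```

The "clean" set is the sketch's stand-in for a register that is slaving a
known value: it costs a restore but no save.  Changing the sizes of the
three sets shows how sensitive the comparison is to how full the registers
are at the call.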