Path: utzoo!utgpu!attcan!uunet!convex!mozart!csmith
From: csmith@mozart.uucp (Chris Smith)
Newsgroups: comp.arch
Subject: Re: register save/restore
Message-ID: <699@convex.UUCP>
Date: 7 Nov 88 09:22:04 GMT
References: <3300037@m.cs.uiuc.edu> <5938@killer.DALLAS.TX.US> <7580@aw.sei.cmu.edu>
Sender: news@convex.UUCP
Lines: 54
In-reply-to: firth@sei.cmu.edu's message of 2 Nov 88 18:14:29 GMT

In article <7580@aw.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:

> Which is better - caller saves or callee saves?

Convex computers use caller saves; here are a few more observations to
toss in.


> B. Are the strategies equally sound?

> The point I consider most important, is that there is a definite
> semantic asymmetry between the two strategies.  If the caller saves,
> then the caller is saving, locally, his own local state.  

It's worth noting that this also gives the caller a chance to do a free
"context switch" of register contents -- a burst of register-memory
traffic is inevitable no matter who does the save, but putting it in the
caller allows him to capitalize on the opportunity to load up a different
-- more useful -- set of values after the call.


> Rough guesses I have accumulated over time are
>
> * at any call point, caller is using ~ 2/3 of the registers it will
>   use at all (though this is partly due to defects in register
>   allocation strategies)
>
> * on average, a procedure call is (almost) immediately followed by
>   another about 2/3 of the time. This implies that if the caller
>   saves, it will have to save ~ 45 times for every 100 calls.
>
> These two factors together imply that the cost of a caller-saves protocol
> is about 1/3 that of a callee-saves protocol.  (Do you believe that?)

Dynamically, our registers tend to be fuller than that -- they fill up at
the drop of a hat anyway, but if they don't, loop unrolling sees to it
that they do.  But this one:

> * the register may be slaving a known value.  The caller then need not
>   save at all, merely restore.  I find this is true of at least 20% of
>   live registers. 

operates very powerfully on a register machine, where loads are required
for every memory operand.  
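
To make the slaving point concrete, here is a rough sketch of the memory traffic around one call under caller-saves. The function name and the counts fed to it are made up for illustration; the only thing taken from the discussion is the rule that a register mirroring a value already in memory needs a reload but no store.

```python
def caller_save_ops(live, slaved):
    """Memory operations around one call under caller-saves.

    live   -- registers holding values still needed after the call
    slaved -- how many of those merely mirror a value already in memory
    (Both parameters are hypothetical inputs, not measured figures.)
    """
    stores = live - slaved  # only registers with no memory copy need saving
    loads = live            # every live register is reloaded afterwards
    return stores + loads


# 10 live registers, 2 of them slaved (the quoted 20%):
print(caller_save_ops(10, 2))  # 18, against 20 with nothing slaved
```

On a machine that must load every memory operand into a register before using it, more registers end up as plain copies of memory, so the slaved count tends to be higher.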

All in all, on this machine and on *one* benchmark, the Fortran
validation tests, it looks like caller-saves is just under half the cost
of callee-saves.  (Counting only the register saves and restores.)
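
For comparison, the arithmetic behind the quoted 1/3 estimate can be spelled out as below. This is only a back-of-envelope sketch: both inputs are the rough guesses from the quoted article, and treating callee-saves as a baseline cost of 1 per call is an assumption made here to get a ratio.

```python
# Inputs are the quoted rough guesses, not measurements.
live_fraction = 2 / 3       # share of the caller's registers live at a call
saves_per_call = 45 / 100   # caller-side saves per call, per the quoted figure

# Assumption: callee-saves costs 1 unit per call, so the product below
# is the caller-saves cost relative to callee-saves.
caller_over_callee = live_fraction * saves_per_call

print(round(caller_over_callee, 2))  # 0.3
```

With fuller registers the first factor moves toward 1, which pushes the ratio up from the estimated 1/3 toward the just-under-half figure seen on the validation tests.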


One other point: debuggers are up to hunting through saved frames to find a
variable allocated to R4, but when variables flit from place to place, as
they are prone to do when the domain of register allocation is the interval
between calls, the debugger tables come under quite a strain.
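
A sketch of what that strain looks like: instead of one entry saying "x lives in R4", the debugger needs a table of program-counter ranges per variable. The table layout, addresses, and names below are hypothetical and do not describe any particular debugger's format.

```python
# Hypothetical location table for one variable:
# (start_pc, end_pc, location), with end_pc exclusive.
X_LOCATIONS = [
    (0x100, 0x120, "R4"),
    (0x120, 0x128, "stack+8"),  # saved by the caller across a call
    (0x128, 0x150, "R2"),       # reloaded into a different register
]


def locate(table, pc):
    """Return where the variable lives at this pc, or None if not live."""
    for start, end, where in table:
        if start <= pc < end:
            return where
    return None


print(locate(X_LOCATIONS, 0x124))  # stack+8
```

Every call site can add entries like these for every live variable, which is where the table growth comes from.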