Xref: utzoo comp.lang.c:26956 comp.lang.misc:4477
Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!yale!cmcl2!lanl!lambda!jlg
From: jlg@lambda.UUCP (Jim Giles)
Newsgroups: comp.lang.c,comp.lang.misc
Subject: Re: function calls
Message-ID: <14272@lambda.UUCP>
Date: 16 Mar 90 00:30:59 GMT
References: <1990Mar15.173408.29622@utzoo.uucp>
Lines: 62

From article <1990Mar15.173408.29622@utzoo.uucp>, by henry@utzoo.uucp (Henry Spencer):
> In article <14268@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
> [...]
>>1) Save/restore all register values that are still 'live' values for the
>>   caller.  ...  The problem is: _MOST_ of the procedure call
>>overhead is concentrated in number (1)!  And, this overhead applies to all
>>procedures - including 'leaf' routines.  Basically, 'leafness' has very
>>little to do with procedure call overhead...
> 
> Um, do you want to go back and re-think that statement, Jim?  Leafness has
> a lot to do with it, because a leaf procedure doesn't have to keep things
> neat in case it calls somebody.  In particular, if you assign some of your
> registers to be scratch registers, normally unused at call time, then a
> leaf procedure just uses them for all its internal work and has to save
> and restore *nothing* except in very unusual cases.  This is exactly what
> the Mips compilers do.  [...]

This is also what the Cray compilers do for 'baselevel' routines.  The
problem arises with your assumption that the are such things as "scratch
registers, normally unused at call time."  If your register scheduling
algorithm in the compiler is sufficiently greedy, you will, almost always
have some save/restores to do.  And note: it is almost always a good thing
for your register scheduler to be as greedy as possible.  In fact, the only
exception is the procedure call overhead.  So, I _HAVE_ rethought my first
stetement - and I still stand by it!

> [...]
> The correct statement is that the callee must save/restore all register
> values that are still live for the caller *and that the callee is going
> to overwrite*.  [...]

Quite true.  But, in the context of separate compilation, how does the
compiler _KNOW_ which registers are to be used by the 'callee'?  Or, when
compiling the 'callee', how does the compiler know which values are still
'live' for the 'caller'?  Now, I'm all for the idea of interprocedural
analysis.  I'm even a supporter of having the loader do the code generation
for this very reason.  But right now, the procedure interface is still
one of the most costly parts of any program.

> [...]          Careful register-allocation conventions (with an adequate
> supply of registers) can minimize that number, usually to zero in the case
> of leaf functions.

Careful register-allocation conventions are usually the ones that use the
registers most greedily.  This is because, in general, the more you use
the registers, the faster your code goes.  If your "careful" approach to 
registers is not greedy, how much performance am I loosing to it by not
getting full use out of the hardware availible?  Further: what's "an
adequate supply of registers"?  I know code which can use up as many as
you give me.  In fact, if interprocedural analysis were commonplace
instead of rare, you would probably find that the register set was almost
always completely full (this wouldn't matter since it would be a logical
consequence of register allocation being done on the program instead of
a routine at a time).

These problems may all be solved in the future - even the very near future.
But at present, only MIPS and Cray (the only ones mentioned anyway) have
addressed this problem.  And these two 'solutions' rely on the 'callee'
not using lots of registers and the 'caller' deliberately leaving some
spare ones - but this, in itself, may have a negative impact on performance.

J. Giles