Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!spool.mu.edu!news.cs.indiana.edu!rutgers!mit-eddie!uw-beaver!milton!ogicse!intelhf!ichips!inews!iwarp.intel.com!csun!kithrup!sef
From: sef@kithrup.COM (Sean Eric Fagan)
Newsgroups: comp.arch
Subject: Re: register save
Message-ID: <1991Mar12.025211.16612@kithrup.COM>
Date: 12 Mar 91 02:52:11 GMT
References: <1991Mar11.192116.1974@dgbt.doc.ca>
Organization: Kithrup Enterprises, Ltd.
Lines: 74

In article <1991Mar11.192116.1974@dgbt.doc.ca> don@dgbt.doc.ca (Donald McLachlan) writes:
>	The idea of saving the return address in a register, rather than
>on the stack sounds nice to me, but ...
>	This requires that the calling function knows that the called
>function is a leaf function, not very practical from a high level language
>point of view as far as I can tell.

Uhm... that doesn't quite follow.  How did you come up with that conclusion?

The *called* function knows whether or not it's a leaf function; this is
fine, as it's the one that decides whether to save the return address.  (That
is, if I call any other functions, I will save my return address.  I don't
need to know anything about the functions I call%.  If I don't call any
other functions, then, obviously, I don't need to save the return-address
register.)

----
% One thing you can do, with enough magic software, is to have the linker do
N-level searches on all called functions, and decide what set of registers
it can safely use without saving.
----
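
A rough sketch of what that linker search might look like, in C.  Everything
here is made up for illustration (struct func, clobbered(), and the bitmask
encoding are my assumptions, not any real linker's data structures).  Each
function records which registers it writes itself; the search follows calls
up to N levels and gives up conservatively if it can't see the bottom:

```c
#include <stdint.h>

#define MAX_CALLEES 4

/* Hypothetical per-function summary a linker might build. */
struct func {
    uint32_t uses;              /* bit n set: this function writes register n */
    int ncallees;               /* number of valid entries in callees[] */
    int callees[MAX_CALLEES];   /* indices into the function table */
};

/* Registers that may be overwritten by calling funcs[idx], following
 * calls at most `depth` levels down.  If the depth limit is reached and
 * the function still calls something, assume every register is clobbered.
 * The depth bound also keeps recursive call graphs from looping forever. */
uint32_t clobbered(const struct func *funcs, int idx, int depth)
{
    const struct func *f = &funcs[idx];
    uint32_t mask = f->uses;

    if (f->ncallees > 0 && depth == 0)
        return UINT32_MAX;      /* can't see any further: assume the worst */

    for (int i = 0; i < f->ncallees; i++)
        mask |= clobbered(funcs, f->callees[i], depth - 1);

    return mask;
}
```

The registers a caller can use across the call without saving are then the
complement of the returned mask.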

>	The only way I can think to generalise this would be to always
>put the return address in a dedicated register. 

That's what happens in *lots* of machines:  Crays, 88ks, MIPS, i860s, etc.
The call instruction just puts the return address in a register.  Between
software and hardware conventions, it's pretty quick (note that the call
itself doesn't store anything to memory; accessing memory is *slow*, and
should be avoided almost as much as divides should be 8-)).

>This would require that
>the "call" would first push the old contents onto the stack and then
>load in the new return address. 

If you call any functions, you are not a leaf function.  Therefore, as part
of your prologue, you save the return address you were given, and restore it
as part of the function epilogue.  Note that if you only have a couple of
function calls (i.e., not in a loop), then you can instead save the address
around each call, on the assumption that it might be faster (paths that skip
the call never touch memory).  E.g.:

	void foo(int i) {
		...
		if (foobar == i)
			blech(i&foobar);
		...
	}

might generate code that looks like:

	foo:
		...	; prologue
		load	r2, foobar
		load	r3, i	; probably already in a register, actually
		bne	r2, r3, skip.0
		and	r2, r3, r4	; assuming a delay slot
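		PUSH	r31	; push being a macro; saves the return address
		PUSH	r4	; argument for blech
		call	blech
		POP	r4	; discard the argument (assuming the caller cleans up)
		POP	r31	; restore return address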
	skip.0:
		; whatever
	
>	Now that all the mechanics are out of the way (the way I see them)
>only one question remains. HOW MUCH DOES THIS ACTUALLY SAVE???

At worst, it costs nothing more than what is "traditionally" done.  Best
case, it saves a few memory accesses, which is a Good Thing.
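
To put rough numbers on that, here is a toy count in C.  It is only a model
under my own assumptions: count_accesses() and struct call_cost are invented
names, and it charges one store plus one load per saved return address while
ignoring everything else (caches, argument passing, other saved registers):

```c
#include <stddef.h>

/* Return-address memory accesses under each scheme (hypothetical model). */
struct call_cost {
    unsigned stack_scheme;  /* call always pushes, return always pops */
    unsigned link_scheme;   /* return address lives in a register */
};

/* is_leaf[i] is nonzero when the i-th called function calls nothing. */
struct call_cost count_accesses(const int *is_leaf, size_t ncalls)
{
    struct call_cost c = { 0, 0 };

    for (size_t i = 0; i < ncalls; i++) {
        /* Stack scheme: one store on call, one load on return, always. */
        c.stack_scheme += 2;

        /* Link-register scheme: only a non-leaf saves and restores it. */
        if (!is_leaf[i])
            c.link_scheme += 2;
    }
    return c;
}
```

If every called function is a non-leaf, the two counts come out equal (the
"at worst" case); every leaf call in the mix knocks two accesses off the
link-register total.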

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.