Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!sun-barr!sun!gammara!khb
From: khb@gammara.Sun.COM (Keith Bierman - SPD Languages Marketing -- MTS)
Newsgroups: comp.arch
Subject: Re: Register usage
Message-ID: <107302@sun.Eng.Sun.COM>
Date: 31 May 89 07:49:24 GMT
References: <259@mindlink.uucp> <25382@ames.arc.nasa.gov> <m0fRx4x-0001fDC@mipon2.intel.com> <1RcY6x#64Zq3Y=news@anise.acc.com>
Sender: news@sun.Eng.Sun.COM
Reply-To: khb@sun.UUCP (Keith Bierman - SPD Languages Marketing -- MTS)
Organization: Sun Microsystems, Mountain View
Lines: 59

In article <1RcY6x#64Zq3Y=news@anise.acc.com> lars@salt.acc.com (Lars J Poulsen) writes:
.... # regs on different machines>>

>
>From a humble applications programmer, who occasionally has written a bit
>of kernel code: The biggest pain with an architecture that exposes too
>large a register file is saving and restoring on context switches. While
>interrupt service routines can ignore this and store only what they
>need, context switches require storing of the entire register set. Or do
>people really feel that the processors today are fast enough that
>scheduling pre-emption is too rare to influence the selection of
>register file size ?
>
>Many years ago I switched from working on (then) Univac-1100 to PDP-11's
>and VAXen; the 1100 had about 44 visible registers in the user set; few
>programs really used more than half of them; worse yet, they were
>asymmetric, with different properties between the three major groups.
>
>The VAX instruction set got more mileage out of its 16 general registers
>than the Univac got out of its 44, and saved many cycles on register
>save/restores.

"Modern" compilers tend to do global analysis (a misnomer: it really
means all of a procedure at once); older compilers tended to look at
smaller pieces of code (a basic block, maybe across blocks) ... so
"modern" compilers have the opportunity to make better use of a large
register set.

On scientific application codes the bodies of the modules are large
enough, and there are enough other effects, that saving a considerable
number of registers is not necessarily a large component of the total
cost (on one machine, in a former life, it tended to be less than 5%
of the call cost). 

If one inlines "leaf" nodes, very respectable numbers of registers can
get used.

For those who are slightly twisted, the Cydra 5 register usage,
documented in the recent ASPLOS-III proceedings, shows how the
combination of software pipelining and VLIW can eat up registers quite
quickly.

Upshot: modern compilers can employ as many registers as you can
	design in. If you got 'em, you gotta figure out a way to 
	save 'em. Windows, "multiconnect", register pointers, special
	purpose (including vector) and other solutions are workable.

Naive rationale for infinite (as long as they are free) registers:

constant propagation and common subexpressions, combined with loop
unrolling (or more advanced techniques like percolation scheduling)
result in a need for as many as you got ... and then some.
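
To make that appetite concrete, here is a minimal sketch in Python. It
is not taken from any real compiler; the names peak_live and
unrolled_dot, and the pointer values pa and pb, are hypothetical. It
counts the peak number of simultaneously live values in a straight-line
loop body for a dot product, unrolled k times, assuming the loads are
hoisted ahead of their uses the way a scheduler might do to hide memory
latency.

```python
def peak_live(instrs, live_out=()):
    """Peak number of simultaneously live values in straight-line code.

    instrs is a list of (dest, sources) pairs in program order.
    live_out names the values still needed after the last instruction.
    """
    live = set(live_out)
    peak = len(live)
    # Walk backwards: a value is live from its definition to its last use.
    for dest, srcs in reversed(instrs):
        live.discard(dest)   # dest is defined here, so not live above
        live.update(srcs)    # sources must be live just before this point
        peak = max(peak, len(live))
    return peak


def unrolled_dot(k):
    """Body of `sum += a[i] * b[i]` unrolled k times (assumed schedule:
    all loads first, then all multiplies, then the accumulates)."""
    instrs = []
    for j in range(k):
        instrs.append((f"a{j}", ["pa"]))              # load a[i+j]
        instrs.append((f"b{j}", ["pb"]))              # load b[i+j]
    for j in range(k):
        instrs.append((f"p{j}", [f"a{j}", f"b{j}"]))  # multiply
    for j in range(k):
        instrs.append(("sum", ["sum", f"p{j}"]))      # accumulate
    return instrs


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        need = peak_live(unrolled_dot(k), live_out=("sum", "pa", "pb"))
        print(f"unroll x{k}: peak live values = {need}")
```

Under these assumptions the peak works out to 2k + 3, so unrolling by 8
already wants 19 registers for this toy body alone, before any
constants or common subexpressions are kept around. A different
schedule (interleaving loads and multiplies) would need fewer, at the
price of less latency hiding.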

OS code is different from application programs, and it may well be
better off with a special OS register allocation scheme.

Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |	Marketing Technical Specialist    ! kbierman@sun.com
I Voted for Bill &    |   Languages and Performance Tools. 
Opus  (* strange as it may seem, I do more engineering now     *)