Path: utzoo!attcan!uunet!mcvax!ukc!dcl-cs!aber-cs!pcg
From: pcg@aber-cs.UUCP (Piercarlo Grandi)
Newsgroups: comp.arch
Subject: Re: Register usage [was Re: 80486 vs. 68040 code size]
Summary: Sethi Ullman is old hat... Ok. Works well, though, within its limits.
Message-ID: <959@aber-cs.UUCP>
Date: 18 May 89 12:53:19 GMT
Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi)
Distribution: eunet,world
Organization: Dept of CS, UCW Aberystwyth
	(Disclaimer: my statements are purely personal)
Lines: 108

In article <17556@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
    In article <949@aber-cs.UUCP> pcg@aber-cs.UUCP (Piercarlo Grandi) writes:

    >... I seem to remember that they used PCC and tuned its (bastardized)
    >Sethi Ullman register allocator, which is both pretty good (even the
    >[S]CC one was quite good) and with no assumptions at all.
    
    (why is the `distribution' `eunet,world'?)

Reason again too embarassing to mention. Will send it to you by mail :-(.
[Actually it is no longer necessary, so will change it]
    
    Anyway:  The Sethi-Ullman algorithm is intended to solve the problem
    of computing the number of registers required to evaluate a single
    expression. [ ... ]

Entirely correct. Indeed I had clarified that 'four registers is plenty' was
about C programs, integer, utilities, expression optimization, on a reg-mem
machine (like the 386-68020 to which my comment originally applied).
    
    Modern compilers, on the other hand, usually take entire functions (or,
    increasingly, take entire programs, typically in the `link' phase) and
    allocate registers globally across the thing.

Indeed the debate had moved on to discuss the number of registers useful for
inter expression optimizers, and/or for reg-reg, floating point, numeric
cases. As to this I have been guarded, not recollecting any DATA; I have
only said that my *hunch* is that a few more registers may be useful for
reg-reg and floating/numeric, and that four to eight additional registers
for inter expression optimization would probably be enough (depending on the
architecture and type of code), based on my experience of designing programs
using "register".

Somebody then offered some DATA points; John Mashey has offered his
recollection that 24-28 is a good number to have on a MIPS (hence their 32),
and some AMD fellow (Tim Olson?) has posted statistics that, using a static
analysis, their compiler can on average fill at most 6+7 registers after
global optimization (I am not clear whether this includes intra expression
optimization or not), thus providing an upper bound to the number of *useful*
registers.

And we have been given by some kind person at WRL some additional data
points, for FORTRAN, floating, numeric, global optimization, on a reg-reg
machine, and a diabolically clever compiler.  These numbers (for that is
probably the worst case) say that 4 to 7 are enough for expression
optimization, and 8 for global optimization, IFF the selection of variables
to cache is done on the basis of dynamic frequency of use; otherwise the
compiler has to use about six/seven times that number of registers (i.e. has
to cache virtually everything in sight) to get the about same improvement,
which while good (up to 30%) is not world shattering either.

    PCC does not now and never has done this sort of analysis; it forgets
    everything between one expression and the next.

That is why I like it; with PCC I can specify ("performance-by-design") what
I know to be dynamically important global variables to retain in registers
across expressions, where it matters, without having a very complex compiler
that implements "performance-by-second-guessing-the-designer", which is not
(according to the WRL figures) terribly effective if done using static
analysis (no surprise), and if done using dynamic figures is OK, but then
why not let the programmer do it, at a huge cost in compiler complexity and
bugginess?

Ironically, large register sets, which have some pretty important costs
(e.g. context switch), may be of benefit in reducing the complexity of
compilers; in a machine with 64-256 registers you can essentially cache
everything in sight, which may mean a less sophisticated optimizer can do
the work (you still have a problem of register *allocation*, maybe across
basic blocks, as a stack etc..., but that is much more mechanistic than
trying to do optimizations -- or not?).
    
    (Typically a peephole optimiser attempts to make up for this later.)

When the PCC peephole optimizer limited itself mostly to cleaning up
redundant code sequences and other truly peephole optimizations it was as
reliable as the compiler (i.e. quite); now that people (e.g. SUN, AT&T Sys V
5.3.2) do global analysis in the c2 "peephole optimizer", you have to mark
those sources that should not be compiled with -O... Progress!

    In compiler circles, Sethi-Ullman numbers are considered rather passe' :-)

Fads... :-). It's old, but it works... And sethi-ullman is almost an order of
magnitude smaller than the very sophisticated algorithms being used now (my,
in PCC the sethi ullman routine is a few dozen lines long!!).

APOLOGIES:

	In some past posting I expounded on why what IMNHO are "competent"
	programmers don't need to rely on global optimizers/many registers.

	This did not imply any judgement on the competence of the person
	whose posting (in which he was suggesting that it was incomp...
	oops, lack of familiarity with the relevant literature and
	frequentation of "goggle-eyed" grad students that made me disagree
	with him), I was commenting on.

	I cannot resist also apologizing, in a different vein :-), to John
	Mashey for his opinion that I may not be aware of what has happened
	in the latest 15-20 years in compiler research, and for later stating
	that I spout falsehoods (numbers, please...); and to Chris Torek's
	for his assumption that somebody among the learned participants to
	this forum still needed details on the "limits" of Sethi-Ullman and
	of the PCC, after my having actually extolled them.

-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk