Path: utzoo!attcan!uunet!mcvax!ukc!dcl-cs!aber-cs!pcg From: pcg@aber-cs.UUCP (Piercarlo Grandi) Newsgroups: comp.arch Subject: Re: Register usage [was Re: 80486 vs. 68040 code size] Summary: Sethi Ullman is old hat... Ok. Works well, though, within its limits. Message-ID: <959@aber-cs.UUCP> Date: 18 May 89 12:53:19 GMT Reply-To: pcg@cs.aber.ac.uk (Piercarlo Grandi) Distribution: eunet,world Organization: Dept of CS, UCW Aberystwyth (Disclaimer: my statements are purely personal) Lines: 108 In article <17556@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes: In article <949@aber-cs.UUCP> pcg@aber-cs.UUCP (Piercarlo Grandi) writes: >... I seem to remember that they used PCC and tuned its (bastardized) >Sethi Ullman register allocator, which is both pretty good (even the >[S]CC one was quite good) and with no assumptions at all. (why is the `distribution' `eunet,world'?) Reason again too embarassing to mention. Will send it to you by mail :-(. [Actually it is no longer necessary, so will change it] Anyway: The Sethi-Ullman algorithm is intended to solve the problem of computing the number of registers required to evaluate a single expression. [ ... ] Entirely correct. Indeed I had clarified that 'four registers is plenty' was about C programs, integer, utilities, expression optimization, on a reg-mem machine (like the 386-68020 to which my comment originally applied). Modern compilers, on the other hand, usually take entire functions (or, increasingly, take entire programs, typically in the `link' phase) and allocate registers globally across the thing. Indeed the debate had moved on to discuss the number of registers useful for inter expression optimizers, and/or for reg-reg, floating point, numeric cases. As to this I have been guarded, not recollecting any DATA; I have only said that my *hunch* is that a few more registers may be useful for reg-reg and floating/numeric, and that four to eight additional registers for inter expression optimization would probably be enough (depending on the architecture and type of code), based on my experience of designing programs using "register". Somebody then offered some DATA points; John Mashey has offered his recollection that 24-28 is a good number to have on a MIPS (hence their 32), and some AMD fellow (Tim Olson?) has posted statistics that, using a static analysis, their compiler can on average fill at most 6+7 registers after global optimization (I am not clear whether this includes intra expression optimization or not), thus providing an upper bound to the number of *useful* registers. And we have been given by some kind person at WRL some additional data points, for FORTRAN, floating, numeric, global optimization, on a reg-reg machine, and a diabolically clever compiler. These numbers (for that is probably the worst case) say that 4 to 7 are enough for expression optimization, and 8 for global optimization, IFF the selection of variables to cache is done on the basis of dynamic frequency of use; otherwise the compiler has to use about six/seven times that number of registers (i.e. has to cache virtually everything in sight) to get the about same improvement, which while good (up to 30%) is not world shattering either. PCC does not now and never has done this sort of analysis; it forgets everything between one expression and the next. That is why I like it; with PCC I can specify ("performance-by-design") what I know to be dynamically important global variables to retain in registers across expressions, where it matters, without having a very complex compiler that implements "performance-by-second-guessing-the-designer", which is not (according to the WRL figures) terribly effective if done using static analysis (no surprise), and if done using dynamic figures is OK, but then why not let the programmer do it, at a huge cost in compiler complexity and bugginess? Ironically, large register sets, which have some pretty important costs (e.g. context switch), may be of benefit in reducing the complexity of compilers; in a machine with 64-256 registers you can essentially cache everything in sight, which may mean a less sophisticated optimizer can do the work (you still have a problem of register *allocation*, maybe across basic blocks, as a stack etc..., but that is much more mechanistic than trying to do optimizations -- or not?). (Typically a peephole optimiser attempts to make up for this later.) When the PCC peephole optimizer limited itself mostly to cleaning up redundant code sequences and other truly peephole optimizations it was as reliable as the compiler (i.e. quite); now that people (e.g. SUN, AT&T Sys V 5.3.2) do global analysis in the c2 "peephole optimizer", you have to mark those sources that should not be compiled with -O... Progress! In compiler circles, Sethi-Ullman numbers are considered rather passe' :-) Fads... :-). It's old, but it works... And sethi-ullman is almost an order of magnitude smaller than the very sophisticated algorithms being used now (my, in PCC the sethi ullman routine is a few dozen lines long!!). APOLOGIES: In some past posting I expounded on why what IMNHO are "competent" programmers don't need to rely on global optimizers/many registers. This did not imply any judgement on the competence of the person whose posting (in which he was suggesting that it was incomp... oops, lack of familiarity with the relevant literature and frequentation of "goggle-eyed" grad students that made me disagree with him), I was commenting on. I cannot resist also apologizing, in a different vein :-), to John Mashey for his opinion that I may not be aware of what has happened in the latest 15-20 years in compiler research, and for later stating that I spout falsehoods (numbers, please...); and to Chris Torek's for his assumption that somebody among the learned participants to this forum still needed details on the "limits" of Sethi-Ullman and of the PCC, after my having actually extolled them. -- Piercarlo "Peter" Grandi | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcvax!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk