Path: utzoo!attcan!uunet!portal!cup.portal.com!bcase
From: bcase@cup.portal.com (Brian bcase Case)
Newsgroups: comp.arch
Subject: Re: Register usage [was Re: 80486 vs. 68040 code size]
Message-ID: <18235@cup.portal.com>
Date: 11 May 89 18:54:14 GMT
References: <921@aber-cs.UUCP>
Organization: The Portal System (TM)
Lines: 62

>...you need to generate code assuming that you
>have say 1 to 16 register available, and then show that as the number of
>register increases, program speed/code size improves significantly.

Well, it's old and CISCy stuff, but the paper:

    Chow and Hennessy, "Register Allocation by Priority-based Coloring,"
    Proc. SIGPLAN Symp. on Compiler Construction, SIGPLAN Notices
    vol. 19, no. 6, June 1984.

shows some performance numbers for a variable number of registers.
The target architectures were the PDP-10 and the 68000. A maximum of
9 registers was available. The fastest performance was achieved when
the maximum number of registers was used.

>changing the number of registers available to its Sethi-Ullman register
>allocator, and then benchmarking a few Unix tools.
>They found that in these conditions (CISC machine, no interexpression
>optimization, virtually only fixed point computation) speed/code size did
>not improve substantially with more than three scratch registers, and four
>were plenty.

But of course! Is this a surprise? It isn't to me.

>I can imagine that for machines not like the 386/68020, e.g. RISC machines
>with a reg-reg architecture, more registers may be useful, but as far as I
>know there are no figures for this situation. This is an interesting
>research project: take GCC for the SPARC, and redo the exercise. Or the AMD
>29k compiler, or the MIPS compilers suite, etc...

The paper cited above concluded that for the CISC-ish PDP-10 and 68000,
using all 9 available registers was best (the rest were reserved for
exclusive use by the code generator, I think).

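On the Sethi-Ullman result quoted above, a small illustration of why three
or four scratch registers already go a long way. Under the textbook
Sethi-Ullman labelling for a register-register machine, the register need of
an expression tree grows only when both subtrees are equally demanding. The
sketch below is a toy formulation (the tuple encoding and the name
`regs_needed` are made up for illustration), not the allocator used in the
study being discussed.

```python
# Toy Sethi-Ullman labelling: how many registers does an expression tree
# need if it is evaluated without spilling? Leaves are strings; interior
# nodes are (op, left, right) tuples. Assumes a register-register machine,
# so every leaf has to be loaded into a register first.

def regs_needed(node):
    if isinstance(node, str):      # leaf: one register to hold the operand
        return 1
    _, left, right = node
    l, r = regs_needed(left), regs_needed(right)
    # Equal needs: one extra register holds the first result while the
    # second subtree is evaluated. Otherwise evaluate the hungrier side
    # first and reuse its registers for the other side.
    return l + 1 if l == r else max(l, r)

simple = ("+", "a", ("*", "b", "c"))                 # a + b*c
balanced = ("+", ("*", "a", "b"), ("*", "c", "d"))   # a*b + c*d

print(regs_needed(simple), regs_needed(balanced))
```

Here `a + b*c` needs 2 registers and `a*b + c*d` needs 3; under this
labelling it takes a perfectly balanced tree of 8 operands to need 4. If
expressions with more than two operators really are rare, expression
evaluation alone cannot use many registers, which is why the interesting
gains have to come from elsewhere.
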
>I still find it difficult that one would find a substantial difference
>(especially given the abundant statistics on the simplicity of the average
>expression -- expressions with more than two operators are a rarity) and
>indeed the AMD data above seem to say that seven registers is about what a
>compiler can use (for expression optimization). This, let me say, looks like
>four registers + three for local "register" variables :->.

Are you forgetting, maybe, that registers can also be used in clever ways
to avoid save/restore overhead on procedure calls? The following paper:

    Wall, "Global Register Allocation At Link Time," ACM SIGPLAN conf.
    on Compiler Construction, June 1986 (sorry, I can get a more
    complete ref. if anyone wants it).

talks about using 52 registers with link-time allocation (the machine,
the DECWRL Titan, has 64 GPRs). The allocator tries to keep as many
procedure contexts in registers as possible. Neat stuff.

>As to the six global registers, their contribution is hard to assess. But on
>them let me say that on one thing I agree: global "register" variables (that
>unfortunately C does not have, thus forcing the compiler to intuit them) are
>demonstrably good in one important case, when the program to which they are
>global uses them to cache the state of some automaton, e.g. an interpreter.

For the 29K, the global registers are not used for storing data declared
in the C global scope. They are temporaries used for expression evaluation,
etc. If the compiler could put globally-scoped data in global registers
(as can be done by the DECWRL "at-link-time" stuff), many more could be
used.
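
To make the save/restore point concrete, here is a minimal sketch of the
idea of keeping several procedure contexts in registers at once. It is not
Wall's actual algorithm. The call graph, the per-procedure register counts,
and the names `CALLS`, `LOCALS`, `first_reg` and `allocate` are all invented
for illustration, and the call graph is assumed to be acyclic (no
recursion).

```python
# Toy link-time style allocation over an acyclic call graph: procedures that
# can be active at the same time get disjoint register ranges, so a call
# between them needs no save/restore. Procedures that are never on the same
# call chain may share registers.

from functools import lru_cache

# Hypothetical program: who calls whom, and how many register candidates
# each procedure has.
CALLS = {"main": ["parse", "emit"], "parse": ["lex"], "emit": [], "lex": []}
LOCALS = {"main": 3, "parse": 4, "emit": 2, "lex": 5}

@lru_cache(maxsize=None)
def first_reg(proc):
    # A procedure's range starts just above everything its callees
    # (and, transitively, their callees) may occupy.
    return max((first_reg(c) + LOCALS[c] for c in CALLS[proc]), default=0)

def allocate():
    return {p: range(first_reg(p), first_reg(p) + LOCALS[p]) for p in CALLS}

for proc, regs in allocate().items():
    print(proc, list(regs))
```

With this made-up call graph, `lex` and `emit` share registers 0 and 1
because neither can be active while the other is, and `main` ends up in
registers 9 to 11. The longest call chain, not the expression complexity,
determines how many registers such a scheme can put to use.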