Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!bloom-beacon!apple!rutgers!bellcore!texbell!killer!elg
From: elg@killer.Dallas.TX.US (Eric Green)
Newsgroups: comp.arch
Subject: Re: 80486 vs. 68040 code size [really: how many regs]
Message-ID: <8082@killer.Dallas.TX.US>
Date: 12 May 89 01:51:53 GMT
References: <927@aber-cs.UUCP>
Distribution: eunet,world
Organization: The Unix(R) Connection, Dallas, Texas
Lines: 105

in article <927@aber-cs.UUCP>, pcg@aber-cs.UUCP (Piercarlo Grandi) says:
> In article <19063@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>>
>> The simplicity of source statements has little to do with the number of
>> registers desirable, unless the only compiler you have generates code
>> on a statement-by-statement basis only, i.e., no optimization.
>>                                         ^^^^^^^^^^^^^^^^^^^^^
>> Optimization is not just (and maybe even not most importantly) inter
>> statement...
>>
>> For example, consider a typical RISC (i.e., load/store), and the C stmts:
>>      a = b + 5;
>>      c = b + 7;
>> [ .... ]
>>
> Your example works, but under special case assumptions: that you are working on
> a reg-reg architecture, whereas we were discussing reg-mem ones; that putting
> all three a,b,c in registers is worthwhile because they are going to be used
> heavily in other parts of the program.

I program a 68000 a lot, and I suspect a 68000 is a fairly typical
reg-memory machine. I write a lot of "C" code. The "C" compiler I'm using
is fairly PCC-like, i.e. it loses all register values between expressions.
In performance-critical portions of the code, I end up modifying the
assembly language output to keep as many values in registers as possible.
This is especially important with globals, because they can't normally be
put into registers. Presto: fewer instruction bytes, and as much as a 20%
improvement in performance.

If only the compiler did it for me, eh?

But wait -- I've used such a machine -- a Pyramid 90x.
It has a fairly state-of-the-art compiler with a global optimizer, and
basically ignores "register" declarations. When I try to go into the
assembly code there and speed it up, guess what? I can't improve the
register allocation one iota.

Anecdotal evidence, certainly. But good enough for me to conclude that
having lots of registers and a good global optimizer is a Good Thing.
People who disagree with that must never have looked at the
assembler-language output of their code on different machines....

> used little. In a reg-mem architecture little-used variables in memory do not
> carry costs as high when you use them.

Foo. If you use the variable three times, keeping it in a register saves
you 4 memory fetches (2 addresses, 2 data) compared with leaving it in
memory. No matter what kind of machine you're using.

>> Why don't you like inter-expression register assignments?
> Well, I like them, as long as the compiler does not do them, but the
> programmer does, by using explicit "register" declarations. But

Have you ever looked at the output of a "C" compiler, and compared the
output of GCC to PCC (i.e. optimizing compiler vs. non-optimizing)?
Re-compiling a program with GCC is a sure-fire way of speeding it up by
20% ;-).

> pay the price, you take your chances. Me, my idea of RISC is a (mostly) zero
> address architecture with 8/12 bit instructions, and four (to avoid extra
> push/pop pairs in multiplexing a single one for the up to four independent
> computations) arith stacks.

Uh, excuse me, have you ever read the various RISC papers? Reaching over
to my handy boxful of 6x5 cards.... urk, can't find the one I wanted, the
1986 David Patterson overview in CACM, but it should be easy enough to
find. I assume you know how to use the indexes in your library? Look up a
few RISC papers, then come back. Then you'll be able to argue convincingly
about the various merits of register-register vs. register-memory vs.
stack models (hints: pipelining, locality of reference >80% for code
memory, cache size, program-fetch bandwidth vs. data-fetch bandwidth, ...).
If you have something to say that wasn't said in the latest CISC/RISC
wars, I'm sure everybody would appreciate hearing it -- but a word of
warning, not much was left UNSAID.

(The biggest argument of the RISC guys is that, because of locality of
reference program-wise, program-memory bandwidth is almost unlimited...
it's data-memory bandwidth that's now the main limit, which is why they
want lots of registers and mostly register-to-register operations. CISC
folks, of course, say that those qualities certainly aren't restricted
only to RISC.... but, anyhow, your reg-mem architecture is blown all to
heaven by what the RISC folks have actually PRODUCED IN SILICON, i.e. it
is NOT the bemused speculations of a goggle-eyed grad student.)

> Note that it is assumed that RISC == reg-reg, and that load-store == reg-reg;
> neither of these equations is necessarily true, as one could have RISC ==
> stack-stack or load-store == stack-stack...

Stack-stack must be reg-reg underneath in order to be adequately fast.
AT&T had a novel architecture some years back called CRISP, which, if I
recall, was a stack-oriented RISC machine. AT&T eventually decided to use
SPARC for their RISC processor instead, in the event they built a
RISC-based computer... I don't really know what became of all that, alas.

It's possible to organize registers in a number of different, novel ways.
Stack-stack saves on program-memory bandwidth, at the cost of reducing
flexibility of register use. But note that program-memory bandwidth is the
one thing there's no shortage of.

--
|    // Eric Lee Green              P.O. Box 92191, Lafayette, LA 70509     |
|   //  ..!{ames,decwrl,mit-eddie,osu-cis}!killer!elg     (318)989-9849     |
|  //    Join the Church of HAL, and worship at the altar of all computers  |
|\X/   with three-letter names (e.g. IBM and DEC). White lab coats optional.|
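
For anyone who wants to check the fetch-counting argument above (use a
variable three times, save four fetches), here is a rough sketch in C. The
cost model is an assumption made purely for illustration: every reference
to a variable held in memory is charged one address fetch plus one data
fetch, and a register-resident variable is charged only for its initial
load. The function names are hypothetical, and no real machine's timings
are implied.

```c
#include <assert.h>

/* Assumed cost model (illustrative only, not taken from any CPU manual):
 * each reference to a variable held in memory costs one fetch for its
 * address and one fetch for its data. */
enum { FETCHES_PER_MEMORY_USE = 2 };

/* Fetches incurred when the variable stays in memory for every use. */
int fetches_if_in_memory(int uses)
{
    assert(uses >= 0);
    return uses * FETCHES_PER_MEMORY_USE;
}

/* Fetches incurred when the variable is loaded once into a register
 * and every later use reads the register instead of memory. */
int fetches_if_in_register(int uses)
{
    assert(uses >= 0);
    return uses > 0 ? FETCHES_PER_MEMORY_USE : 0;
}

/* Fetches avoided by keeping the variable in a register. */
int fetches_saved(int uses)
{
    return fetches_if_in_memory(uses) - fetches_if_in_register(uses);
}
```

Under this model fetches_saved(3) comes to 4, i.e. two address fetches and
two data fetches, and the saving grows by two for every additional use. A
variable used only once saves nothing, which is the one case where leaving
it in memory costs no extra.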