Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!lll-lcc!pyramid!prls!mips!earl
From: earl@mips.UUCP (Earl Killian)
Newsgroups: comp.arch
Subject: Re: D-machine helped spawn RISC
Message-ID: <696@gumby.UUCP>
Date: Fri, 18-Sep-87 23:14:18 EDT
Article-I.D.: gumby.696
Posted: Fri Sep 18 23:14:18 1987
Date-Received: Sun, 20-Sep-87 21:31:11 EDT
References: <347@erc3ba.UUCP> <478@esunix.UUCP> <2785@ames.arpa> <6266@apple.UUCP> <6281@apple.UUCP>
Lines: 48

In article <6281@apple.UUCP>, bcase@apple.UUCP (Brian Case) writes:
> About the only real data that I can offer is that the percentage of
> loads/stores for stack-cache machines (RISC II, SPARC, Am29000, etc) is
> often about 1/2 that observed in machines with only flat register files
> (MIPS, etc.).

If the percentage for those machines is really half that of the
MIPS-style RISC machines, I suspect it is because either those machines
have compilers that generate unnecessary non-load/store instructions, or
the architecture requires extra non-load/store instructions to get the
same work done (e.g. using condition codes in RISC II, SPARC, and
address arithmetic in Am29000, etc.). We really need a unit of real
work for integer programs, like the flop for fp programs. Then we could
measure load/stores per workunit.

The data that I have below suggests that for the MIPSco
architecture/compiler the savings vary widely, but never get to 50%.
This data is basically the % of load/stores that are due to register
save/restore. Register windows would eliminate some, but not all, of
these (load/stores for window overflow/underflow should be factored in).
So these are an upper bound on the savings. (That is, savings of
load/stores, not of cycles.)

	espresso	0.6%
	spice		4.0%
	wolf		5.6%
	yacc		10%
	diff		12%
	compress	12%
	uopt		18%
	nroff		28%
	ccom		38%

P.S. I'm actually a fan of register windows, even though the MIPSco
architecture doesn't have them.
However, I think some of the common wisdom about register windows is
wrong (e.g. how many load/stores they save) and overstates their
usefulness. This is because the early work was done without the benefit
of optimizing compilers. Too bad; with an optimizing compiler, the early
designers would have found that only half as many physical registers
(i.e. silicon) are necessary to get the same performance. The SPARC
folks, who do have a good compiler, discovered this too (but too late to
change the architecture to take advantage of it).

The worst thing about register windows is that they are sometimes used
to justify multi-cycle load/store. For a program like spice, you save
at most 4% and pay 32% (the % of remaining load/stores in spice) for
every extra cycle added to load/store. Yuck.
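
The spice arithmetic above can be sketched as a small calculation. This is only a rough model, not anything from the measurements themselves: it assumes every instruction takes one cycle before the change, and it reads the 4% generously as a saving in cycles and the 32% as the fraction of instructions that are still load/stores once windows are in place. The function and constant names are made up for illustration.

```python
# Rough cost model for trading register windows against slower load/store.
# Assumptions (not from the post): baseline of one cycle per instruction,
# and both percentages expressed as fractions of the baseline cycle count.

SPICE_MAX_SAVING = 0.04      # upper bound on the saving for spice (4%)
SPICE_REMAINING_LS = 0.32    # load/stores that remain in spice (32%)


def net_cycle_change(max_saving, remaining_ls_fraction, extra_cycles):
    """Net change in cycles, as a fraction of the baseline cycle count.

    Positive means the machine got slower: each remaining load/store
    pays `extra_cycles`, while at most `max_saving` is gained back.
    """
    return extra_cycles * remaining_ls_fraction - max_saving


def break_even_extra_cycles(max_saving, remaining_ls_fraction):
    """Extra cycles per load/store at which the saving is exactly used up."""
    return max_saving / remaining_ls_fraction


if __name__ == "__main__":
    for extra in (1, 2):
        change = net_cycle_change(SPICE_MAX_SAVING, SPICE_REMAINING_LS, extra)
        print(f"{extra} extra cycle(s): net change {change:+.0%} of baseline")
    limit = break_even_extra_cycles(SPICE_MAX_SAVING, SPICE_REMAINING_LS)
    print(f"break-even at about {limit:.3f} extra cycles per load/store")
```

Under these assumptions, even a single extra load/store cycle costs spice far more than the windows can give back, since the break-even point is well below one full cycle.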