Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!decwrl!nestvx.dec.com!neideck
From: neideck@nestvx.dec.com (Burkhard Neidecker-Lutz)
Newsgroups: comp.arch
Subject: Re: Register usage
Message-ID: <8905120616.AA07871@decwrl.dec.com>
Date: 12 May 89 06:16:50 GMT
Organization: Digital Equipment Corporation
Lines: 21

Reference: 10189,9851

Yes, the savings have to do with the load delays from the data cache.
Another table in the paper shows the relative savings for a certain
allocation method with 52 registers and varying data cache latencies:

Program         Data cache speed in cycles
                  1       2       3       4       5
------------------------------------------------------------------------
Simulator        12 %    19 %    24 %    27 %    29 %
Verifier         10 %    15 %    19 %    21 %    23 %

So there is an increasing potential for savings with longer data cache
latencies, but a faster cache will always help you on those nasty array
references (combined with block refills); have a look at the R2000/R3000
speed/MHz numbers for these effects. And given the availability of
ridiculously fast static RAMs (I have a Cypress Semiconductor catalogue
in front of me: CY100E474 BiCMOS RAM, 4k, 3 ns...), there doesn't seem
to be a good reason not to have at least the first-level cache be
single cycle, even if you go to > 100 MHz clocks.
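
To illustrate why the savings grow with latency but flatten out, here is a rough sketch of a toy cycle-count model. It is not the method used in the paper; the function name, the workload numbers and the assumption that every load stalls for the full cache latency are all made up for illustration. Run time is taken as "other" cycles plus loads times latency, and better register allocation simply removes some of the loads.

```python
def relative_savings(latency, other_cycles, loads, loads_removed):
    """Fraction of run time saved when `loads_removed` loads stay in registers.

    Toy model (an assumption, not taken from the paper): each load costs
    `latency` cycles, everything else costs `other_cycles` in total.
    """
    if not 0 <= loads_removed <= loads:
        raise ValueError("loads_removed must be between 0 and loads")
    before = other_cycles + loads * latency
    after = other_cycles + (loads - loads_removed) * latency
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical workload: 1000 non-load cycles, 400 loads, 150 of them
    # eliminated by the register allocator.
    for latency in range(1, 6):
        saved = relative_savings(latency, 1000, 400, 150)
        print(f"latency {latency}: {saved:.0%} saved")
```

In this model the savings rise with every extra cycle of latency, but by less each time, since the remaining loads get slower too. That is the same general shape as the figures quoted above, though the real numbers depend on the programs and the allocator.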