Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!batcomputer!itsgw!steinmetz!uunet!cbmvax!jesup
From: jesup@cbmvax.UUCP (Randell Jesup)
Newsgroups: comp.arch
Subject: Re: When is RISC not RISC?
Message-ID: <6084@cbmvax.UUCP>
Date: 24 Feb 89 23:38:36 GMT
References: <4592@tekgvs.LABS.TEK.COM> <8476@aw.sei.cmu.edu> <5964@cbmvax.UUCP> <644@m3.mfci.UUCP>
Reply-To: jesup@cbmvax.UUCP (Randell Jesup)
Organization: Commodore Technology, West Chester, PA
Lines: 69

In article <644@m3.mfci.UUCP> rodman@mfci.UUCP (Paul Rodman) writes:
>In article <5964@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>> Very slightly, since PFX is an instruction, it just routes the result
>>to the immediate value register. The most complex part of this (not very)
>>is shifting the value in the IVR over when a PFX is executed (easy because
>>it's a fixed shift).
>>
>You mean it takes me *cycles* to build a >4 bit constant??? *Gasp,choke.*
>I guess thats fine for some machines, but if you're reading 4 x 64 bit words
>from a large array every beat from a common block, you may need lots of
>constants without such a penalty.

Excuse me, but what does loading from common blocks have to do with
constant sizes? As I said, if you look at statistics on usage of constants,
a very large percentage of them will fit in 4 bits. This design was not a
total pedal-to-the-metal design, but one that balanced memory speed, size, and
bandwidth against processor speed. 16-bit instructions allow us to use much
denser, slower, and cheaper instruction memories while still running at 40 MHz,
much faster than we could have run with 32-bit instructions at 20 MHz (given
constant I-Mem speed).

Remember, there are other uses of RISC chips than in Unix workstations.
Embedded controllers, for one.

>> I say there is no difference in Icache complexity due to 16-bit
>>instructions.
>
>There would be on a machine that was trying to do more than one lousy
>operation per cycle.

Why the flame?
I was talking about the design decision between 32-bit
instructions and 16-bit ones on the RPM-40. Not many 'RISC' chips execute more
than one instruction per cycle, certainly not the RPM-40. It pushed the design
rules a fair ways, and had several close-to-critical paths (in other words,
without process/design rule changes it would be very hard to add a lot to it).

>>Therefor, you should get twice as many instructions into
>>it, and if the 15% figure is true, then you should get effectively 15%
>>more done with what's in the icache. (This assumes a loop the size of the
>>icache. For other conditions, it may change the hit-rate instead.)
>
>Icaches can be made huge. Who cares?

People who can't afford huge external ICaches. Also, at the speeds
we're talking about, there are loading limitations on how much RAM you can
attach to the processor. The ICache I was talking about was the on-chip
ICache.

>> Icache is what makes the world go around. The next stepping stone:
>>making dcaches more efficient (their current hit-rates are lousy unless they're
>>ridiculously large).
>
>Why not have a *real* pipelined memory system and a compiler than can
>handle more than one miserable outstanding load in flight? Or have both.

We do. We can do one load per cycle; the memory systems are pipelined.
(The latency is more than one cycle, of course.)

> Paul Rodman
> rodman@mfci.uucp

Calm down.

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup