Path: utzoo!utgpu!utstat!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!cbmvax!jesup
From: jesup@cbmvax.UUCP (Randell Jesup)
Newsgroups: comp.arch
Subject: Re: When is RISC not RISC?
Message-ID: <5964@cbmvax.UUCP>
Date: 13 Feb 89 21:07:19 GMT
References: <4592@tekgvs.LABS.TEK.COM> <8476@aw.sei.cmu.edu>
Reply-To: jesup@cbmvax.UUCP (Randell Jesup)
Organization: Commodore Technology, West Chester, PA
Lines: 76

In article <8476@aw.sei.cmu.edu> firth@bd.sei.cmu.edu (Robert Firth) writes:
>In article jk3k+@andrew.cmu.edu (Joe Keane) writes:
>>As much as i dislike VAX instruction encoding, i can't agree with this.
>>Single-size instructions are nice, but you'll pay a price in code density. The
>>RT has two instruction sizes, and i think it was the right choice.
>
>This issue has been argued quite vigorously in the DoD RISC program,
>and I'd like to offer an unobjective opinion.
>
>It seems pretty clear that instruction density can be improved by having
>more than one length. The GE design used two lengths (16 and 32) and
>claimed a 15% improvement in instruction density as a result. This seems
>reasonable to me.

Not quite. The GE rpm-40 had one instruction size: 16 bits. One of the
instructions was 'prefix (PFX)', which supplied 12 bits of immediate for
use in the next instruction (most instructions using immediates could use
4 bits of immediate directly, except things like branch, which used 12).
You could put multiple PFX instructions before a regular one to build up
a 32-bit immediate. Most constants (I think the number was 90%) fit in
4 bits, and close to 99% fit in 16 (1 PFX instruction). The other
disadvantage of 16-bit instructions is that you get two-operand
instructions instead of 3-operand (though many times three operands
aren't needed). The advantage of 16-bit instructions is memory bandwidth.

>However, set against that the following
>
>. more complicated instruction decode

Very slightly. Since PFX is an instruction, it just routes its result to
the immediate value register (IVR). The most complex part of this (not
very) is shifting the value in the IVR over when a PFX is executed (easy
because it's a fixed shift).

>. more complicated pipeline management

All the reorganizer has to do is keep PFXs with their associated
instructions. It does add a small amount of complexity, but not much.

>. more complicated Icache design

On this you're wrong, since PFX is just another instruction.

>. loss of one bit in span of relative branch or call

Also wrong, since a branch doesn't need to indicate whether a following
word is part of the instruction: it just takes the IVR and masks in the
rest of the immediate from the branch. For one PFX, you are limited to
24 bits relative addressing. That's usually enough. If it isn't, use two
PFXs for 32 bits relative.

>If the goal is pure speed, the question I ask is: are you better off
>with 15% more bytes of instructions and a bigger Icache? If you can
>raise the hit rate from 93% to 94% you have offset the difference in
>instruction size (you fetch 6% rather than 7% for a reduction of 14%
>in instructions fetched). Moreover, very little new logic is involved,
>just more of the same.

I say there is no difference in Icache complexity due to 16-bit
instructions. Therefore, you should get twice as many instructions into
it, and if the 15% figure is true, then you should get effectively 15%
more done with what's in the icache. (This assumes a loop the size of
the icache. For other conditions, it may change the hit-rate instead.)

>My view (for what it's worth) is that with CURRENT technology it is
>better to have all instructions the same length. However, you do
>need a big Icache (as I think the evolution of the Mips Inc machines
>demonstrates).

Icache is what makes the world go around.
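
For readers who want the PFX mechanism spelled out, here is a minimal
sketch in Python of how prefixes might accumulate in the IVR and get
merged into the following instruction. It is an illustration under
stated assumptions, not the rpm-40's actual logic: the class name, the
method names, the choice to clear the IVR after use, and the lack of
sign extension are all hypothetical. Only the field widths (12 bits per
PFX, 4 bits in most instructions, 12 in a branch) come from the post.

```python
MASK32 = 0xFFFFFFFF  # assumption: immediates are truncated to 32 bits
PFX_BITS = 12        # each PFX supplies 12 bits (from the post)


class IVR:
    """Hypothetical model of the immediate value register."""

    def __init__(self):
        self.value = 0

    def pfx(self, imm12):
        # Executing a PFX: fixed 12-bit shift, then merge in the new bits.
        self.value = ((self.value << PFX_BITS) | (imm12 & 0xFFF)) & MASK32

    def take(self, imm, width):
        # The consuming instruction shifts the accumulated prefix over by
        # the width of its own immediate field, masks that field in, and
        # (assumption) clears the IVR for the next instruction.
        field = imm & ((1 << width) - 1)
        result = ((self.value << width) | field) & MASK32
        self.value = 0
        return result


ivr = IVR()

# One PFX plus a 4-bit immediate field gives a 16-bit constant.
ivr.pfx(0xABC)
alu_imm = ivr.take(0xD, 4)

# One PFX plus a 12-bit branch field gives a 24-bit displacement.
ivr.pfx(0x123)
branch_24 = ivr.take(0x456, 12)

# Two PFXs plus a branch assemble 36 bits, truncated here to 32.
ivr.pfx(0x123)
ivr.pfx(0x456)
branch_32 = ivr.take(0x789, 12)
```

Note that in this model the branch carries no "another word follows"
flag: it reads whatever is in the IVR, which is why no bit of branch
span is lost. A real design would presumably sign-extend the result;
that is left out here to keep the sketch short.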
The next stepping stone: making dcaches more efficient (their current
hit-rates are lousy unless they're ridiculously large). This may require
yet more integration of back-end software and silicon, or even front-end
software.

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup