Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!cs.utexas.edu!usc!apple!baum
From: baum@Apple.COM (Allen J. Baum)
Newsgroups: comp.arch
Subject: Re: Why The Move To RISC Architectures? ('386 vs. RISC)
Message-ID: <39746@apple.Apple.COM>
Date: 22 Mar 90 23:38:49 GMT
References: <28012@cup.portal.com> <289@emdeng.Dayton.NCR.COM>
Reply-To: baum@apple.UUCP (Allen Baum)
Organization: Apple Computer, Inc.
Lines: 51

[]
>In article <289@emdeng.Dayton.NCR.COM> hrich@emdeng.UUCP (George.H.Harry.Rich) writes:
>.In article <28012@cup.portal.com> Will@cup.portal.com (Will E Estes) writes:
>.>architectures really tell you anything of worth?
>...
>.>
>.>Finally, why is everyone so excited about RISC? Why the move to
>.>simplicity in microprocessor instruction sets? You would think
>.>that the trend would be just the opposite - in order to increase
>.> the speed of very high-level instructions by putting them in silicon
>

Actually, the problem with complex stuff is that it isn't used, so why
put it in? The higher the semantic content, the less often it is used.
RISC attempts to put in the highest semantic content that gets used a
lot, which isn't very high, it turns out.

>First of all, what you save on a complex instruction versus several simple
>ones is the fetch and decode time. If the processor has good prefetch and
>caching what you are generally talking about is decode time. However,
>a really simple instruction set takes less time to decode,

Yes, but if your critical paths are not decode related, then it just
doesn't matter. What matters is reducing critical paths, both in
hardware (where they are generally load/store or branch related) and in
software (the number of instructions needed to perform some function).
CISCs attempt to reduce the second (software) factor. Unfortunately,
they often do this by increasing the first, and they can't do it often
enough to make up for this.

You can make instructions that perform the same actions as a series of
simpler instructions.
I can make n^i variations of the latter, and only a few variations of
the former. Experience has shown that lots of variations get used,
especially after optimization, so it is impossible to pick a small set
of complex instructions that get used enough to make them worthwhile.

Besides, these complex instructions often get executed as a series of
microsteps, and often go no faster than the series of simple
instructions.

Finally, it is possible to re-arrange the order of the simpler
instructions to avoid interlocks, which can't happen inside a complex
instruction.

On the flip side, complex instructions can run a deeper pipeline. If the
instructions can truly be pipelined (a very big if, when interlocks are
taken into account), then this is equivalent to a cheap 'superscalar'
implementation. For example, a series of "Add Mem to Reg" instructions,
which can be pipelined at one per cycle, will run twice as fast as the
simpler "Load Mem to Reg", "Add Reg to Reg" series. The pipeline is more
complex, but it is simpler than a full superscalar implementation. The
question is, with good register allocation, does it happen enough to
make it worthwhile?

--
		  baum@apple.com		(408)974-3385
{decwrl,hplabs}!amdahl!apple!baum
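
A back-of-the-envelope sketch of the arithmetic above, in Python. It
assumes an idealized machine: straight-line code, no interlocks, no
cache misses, and pipeline fill time ignored. The function names and the
sample numbers (100 adds, 30 simple instructions, sequences of length 3)
are made up for illustration, and reading "n^i" as "n simple
instructions, sequences of length i" is an interpretation of the text,
not something the post spells out.

```python
def cycles(instruction_count, issue_per_cycle=1):
    """Cycles needed to issue a straight-line sequence (idealized:
    no interlocks, no misses, pipeline fill ignored)."""
    # Ceiling division: a partly filled issue slot still costs a cycle.
    return -(-instruction_count // issue_per_cycle)


def compare(n_adds):
    """Cycle counts for n memory-to-register adds on three machines."""
    # n "Add Mem to Reg" instructions, pipelined at one per cycle.
    fused = cycles(n_adds)
    # n "Load Mem to Reg" plus n "Add Reg to Reg", one issue per cycle.
    split = cycles(2 * n_adds)
    # The same load/add pairs on a 2-issue superscalar, assuming every
    # pair can be issued together (the "very big if").
    superscalar = cycles(2 * n_adds, issue_per_cycle=2)
    return fused, split, superscalar


def sequence_variations(n_simple, length):
    """Distinct sequences of `length` instructions drawn from
    n_simple simple instructions: n^i."""
    return n_simple ** length


if __name__ == "__main__":
    print(compare(100))                # (100, 200, 100)
    print(sequence_variations(30, 3))  # 27000
```

Under these assumptions the fused form matches the 2-issue machine and
is twice as fast as the split form, while even short sequences of simple
instructions come in far more variations than any small set of complex
instructions could cover. Real interlocks would narrow the first gap.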