Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!crdgw1!CRD.GE.COM
From: oconnordm@CRD.GE.COM (Dennis M. O'Connor)
Newsgroups: comp.arch
Subject: Re: CISC Silent Spring
Message-ID: <5182@crdgw1.crd.ge.com>
Date: 9 Feb 90 15:35:34 GMT
References: <3300098@m.cs.uiuc.edu> <771@sce.carleton.ca> <35456@mips.mips.COM> <25cb6b65.702c@polyslo.CalPoly.EDU> <7826@pt.cs.cmu.edu> <3562@odin.SGI.COM> <35647@mips.mips.COM> <38462@apple.Apple.COM>
Sender: news@crdgw1.crd.ge.com
Reply-To: oconnordm@CRD.GE.COM (Dennis M. O'Connor)
Organization: GE Corporate R&D Center
Lines: 46
In-reply-to: baum@Apple.COM (Allen J. Baum)

baum@Apple (Allen J. Baum) writes:
] >CISCs may well take longer to design (or not), but the key issue is what
] >happens in the critical paths on the chip. From past history (i.e., things
] >like 360/91), you can make any architecture go faster, but if not designed
] >for smooth pipelining, the complexity can get very high.
]
] Bingo! I believe you've said something I believe strongly, and the
] crux is the "designed for smooth pipelining" phrase. I feel that this
] is really the major distinguishing feature between "RISC" & "CISC".

A major illustrative example of this was the MCF architecture,
developed by the military when DEC refused to license the VAX
architecture to MIL-SPEC computer manufacturers.
( MCF was also known as Nebula. )

MCF was very similar to a VAX, but more so. It had recursive
addressing modes, for instance : you could, in a single address
specification, specify something like
( M[x] = contents of memory location x )

    [offset + M[ offset + M[ offset + M[ offset + register ] ] ] ]

I kid you not. And with no limit on the level of nesting.
Just think how easy (!?)
this made compilation of high-level code constructs like

    rec_array( index_array( frame(2).index ).in_ptr ).rec_field( 2 )

;-)

Worse than this, the instruction set was byte-quantized and
variable length, and you couldn't tell how to decode a byte until
all the previous bytes had been decoded. ( One method of solving this
was to decode each byte all five possible ways and then select
the correct decoding. ) The (dynamic) average instruction length was
five bytes, so to achieve, say, 10 million instructions per second
execution you had to decode 50 million bytes per second, one at a time.
Yeesh.

Designing a pipelined architecture for this beast was tough
( for example, the pipeline had a loop in the middle of it to
handle the recursive addressing modes. ) A few changes to the
architecture would have allowed it to run much more quickly.

Apparently, this is what happens when a machine architecture
is designed by ONLY the compiler people ( I guess ) with
no input from the hardware people. The two must work together, IMHO :-)
--
Dennis O'Connor OCONNORDM@CRD.GE.COM
UUNET!CRD.GE.COM!OCONNOR
Science and Religion have this in common : you must take care to
distinguish both from the people who claim to represent each of them.
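
For anyone who wants to see why the nested addressing mode forces a loop
into the pipeline, here is a rough sketch in Python. It is NOT the real MCF
operand encoding: the function names, the list-of-offsets representation and
the dictionary used as memory are all made up for illustration. The second
function just restates the decode-bandwidth arithmetic from the post.

```python
def effective_address(memory, register_value, offsets):
    """Resolve [off_n + M[ ... off_2 + M[ off_1 + register ] ... ]].

    Hypothetical model, not the actual MCF encoding. `offsets` is listed
    innermost first. Each nesting level needs one memory read before the
    next level's address can even be formed, so the reads are serialized.
    Returns (final_address, number_of_memory_reads).
    """
    address = offsets[0] + register_value
    reads = 0
    for offset in offsets[1:]:
        # The next address depends on data that has to be fetched first.
        address = offset + memory[address]
        reads += 1
    return address, reads


def decode_bytes_per_second(avg_instruction_bytes, instructions_per_second):
    """Bytes the decoder must consume per second to sustain a given rate."""
    return avg_instruction_bytes * instructions_per_second


if __name__ == "__main__":
    # Three levels of indirection, as in the expression quoted in the post.
    demo_memory = {104: 200, 208: 300, 312: 400}
    print(effective_address(demo_memory, 100, [4, 8, 12, 16]))
    print(decode_bytes_per_second(5, 10_000_000))
```

Since the nesting depth is unbounded, the number of passes through that
loop is not known until the operand specifier has been decoded, which is
presumably why the hardware could not simply unroll it into a fixed number
of pipeline stages.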