Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!crdgw1!CRD.GE.COM
From: oconnordm@CRD.GE.COM (Dennis M. O'Connor)
Newsgroups: comp.arch
Subject: Re: CISC Silent Spring
Message-ID: <5182@crdgw1.crd.ge.com>
Date: 9 Feb 90 15:35:34 GMT
References: <3300098@m.cs.uiuc.edu> <771@sce.carleton.ca> <35456@mips.mips.COM> <25cb6b65.702c@polyslo.CalPoly.EDU> <7826@pt.cs.cmu.edu> <3562@odin.SGI.COM> <35647@mips.mips.COM> <38462@apple.Apple.COM>
Sender: news@crdgw1.crd.ge.com
Reply-To: oconnordm@CRD.GE.COM (Dennis M. O'Connor)
Organization: GE Corporate R&D Center
Lines: 46
In-reply-to: baum@Apple.COM (Allen J. Baum)

baum@Apple (Allen J. Baum) writes:
] >CISCs may well take longer to design (or not), but the key issue is what
] >happens in the critical paths on the chip.  From past history (i.e., things
] >like 360/91), you can make any architecture go faster, but if not designed
] >for smooth pipelining, the complexity can get very high.
] 
] Bingo! I believe you've said something I believe strongly, and the
] crux is the "designed for smooth piplelining" phrase. I feel that this
] is really the major distinguishing feature between "RISC" & "CISC".

A major illustrative example of this was the MCF architecture, developed by
the military when DEC refused to license the VAX architecture to MIL-SPEC
computer manufacturers. ( MCF was also known as Nebula. )

MCF was very similar to a VAX, but more so. It had recursive addressing
modes, for instance: you could, in a single address specification,
specify something like ( M[x] = contents of memory location x )

[offset + M[ offset + M[ offset + M[ offset + register ] ] ] ]

I kid you not. And with no limit on the level of nesting. Just
think how easy (!?) this made compilation of high-level code
constructs like 
   rec_array( index_array( frame(2).index ).in_ptr ).rec_field( 2 )
;-)
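
To make the cost of that nesting concrete, here is a minimal sketch of
resolving such an operand. It is not MCF's real operand encoding: the
function name, the list-of-offsets representation and the dict standing
in for memory are all assumptions made for illustration. The point it
shows is that every M[ ] level is a memory read that cannot start until
the previous one has finished.

```python
def effective_address(offsets, register_value, memory):
    """Resolve [o0 + M[o1 + M[ ... M[on + register] ] ] ].

    Hypothetical representation: `offsets` lists the offsets outermost
    first, and `memory` is a dict mapping address -> stored value.
    Returns (address, number_of_memory_reads).
    """
    # Innermost term: offset + register, no memory access needed yet.
    address = offsets[-1] + register_value
    reads = 0
    # Work outward; each level needs the result of the previous read.
    for offset in reversed(offsets[:-1]):
        address = offset + memory[address]
        reads += 1
    return address, reads


# The four-offset expression above, with made-up numbers.
memory = {104: 200, 210: 300, 320: 400}
print(effective_address([1, 20, 10, 4], 100, memory))
```

With no limit on nesting, the number of dependent reads per operand is
unbounded, so the hardware has to iterate rather than use a fixed number
of pipeline stages.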

Worse than this, the instruction set was byte-quantized and
variable length, and you couldn't tell how to decode a byte until all
the previous bytes had been decoded. ( One method of solving this was
to decode each byte all five possible ways and then select the correct
decoding. ) The (dynamic) average instruction length was five bytes, so to
achieve, say, 10 million instructions per second execution you had to decode
50 million bytes per second, one at a time. Yeesh.
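
A rough sketch of why that decode is inherently serial, plus the
arithmetic behind the 50 million figure. The encoding here is invented
for illustration ( the low three bits of an instruction's first byte
give its length, minus one ); it is NOT the actual MCF format, and both
function names are assumptions. Even in this simplified form, the start
of instruction k is unknown until instructions 0..k-1 have been
examined.

```python
def instruction_starts(code):
    """Return the byte offset at which each instruction begins.

    Hypothetical encoding: length = (first_byte & 0x07) + 1 bytes.
    """
    starts = []
    pos = 0
    while pos < len(code):
        starts.append(pos)
        length = (code[pos] & 0x07) + 1
        pos += length  # the next start depends on this instruction
    return starts


def required_decode_rate(instructions_per_second, avg_length_bytes):
    """Bytes per second the decoder must consume to sustain the rate."""
    return instructions_per_second * avg_length_bytes


stream = bytes([0x02, 0xAA, 0xBB, 0x00, 0x04, 1, 2, 3, 4])
print(instruction_starts(stream))
print(required_decode_rate(10_000_000, 5))
```

The "decode every byte all possible ways" trick mentioned above trades
hardware for time: it does the per-byte work in parallel and leaves only
the choice of the correct decoding as the serial step.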

Designing a pipelined architecture for this beast was tough ( for
example, the pipeline had a loop in the middle of it to handle
the recursive addressing modes ). A few changes to the architecture
would have allowed it to run much more quickly.

Apparently, this is what happens when a machine architecture is
designed by ONLY the compiler people ( I guess ) with no input
from the hardware people. The two must work together, IMHO :-)
--
  Dennis O'Connor      OCONNORDM@CRD.GE.COM      UUNET!CRD.GE.COM!OCONNOR
  Science and Religion have this in common : you must take care to
  distinguish both from the people who claim to represent each of them.