Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!yale!mfci!colwell
From: colwell@mfci.UUCP (Robert Colwell)
Newsgroups: comp.arch
Subject: Re: CISC Silent Spring
Message-ID: <1228@m3.mfci.UUCP>
Date: 9 Feb 90 14:35:11 GMT
References: <3300098@m.cs.uiuc.edu> <771@sce.carleton.ca> <35456@mips.mips.COM> <25cb6b65.702c@polyslo.CalPoly.EDU> <7826@pt.cs.cmu.edu> <3562@odin.SGI.COM> <35647@mips.mips.COM> <51951@bbn.COM>
Sender: colwell@mfci.UUCP
Reply-To: colwell@mfci.UUCP (Robert Colwell)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 74

In article <51951@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:
>In article <35647@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>>3) Exception-handling is always one of the most trouble-prone areas of
>>a design, and anything that makes it more complex slows down the design
>>process.
>
>Microcode is the way this problem is commonly dealt with.  Microcode
>turns an intractable hardware control mechanism into a part of the
>design that many a computer hardware or software person can
>understand, design, debug, etc.  Because exception handling must be
>dealt with in hardware in a RISC, one could make the claim that this
>makes RISCs more complex (from a certain point of view) than CISCs.

One could also dispute that claim.  Exception handling in normal
machines (meaning those lacking the hard-real-time limit that incoming
missiles pose) doesn't deserve special hardware attention.  Give the
software as much as it needs to clean up the mess.  Anything more
increases the hardware design time and the likelihood that something
will have to be respun to fix bugs.  Sure, now I've moved that
complexity into software, and there it will still have to be dealt
with.  But I don't know of any machines that were late because the
software implementing their exception handlers wasn't ready, and I
can think of lots of examples of complexity-related bugs delaying
hardware.

>Having more registers really does help some of the time, and when
>the industry starts making new CISC architectures I'll bet you will
>see more, now that program size is not so constraining.

You need as many registers as it takes for spilling and restoring 
them to stay off your list of bottlenecks.  This is a fairly 
complicated function of the number of functional units, their
respective latencies, the bandwidth available (and needed) to
and from memory, and the cleverness of the compiler.  Not a
RISC/CISC issue at all (which we first pointed out in 1983).
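
To give a feel for why the register count tracks functional units and
latencies, here is a crude back-of-envelope sketch in Python.  It is
not the "fairly complicated function" itself: it assumes every unit
issues one operation per cycle and that each in-flight result ties up
one register, and it ignores memory bandwidth and compiler cleverness
entirely.  The function name and the example unit mix are made up for
illustration.

```python
def min_registers_in_flight(units):
    """Rough lower bound on registers tied up by in-flight results.

    units: list of (count, latency_cycles) pairs, one per type of
    functional unit.  Assumes one issue per unit per cycle, so each
    unit has `latency_cycles` results outstanding at steady state.
    """
    return sum(count * latency for count, latency in units)


# Hypothetical machine: 2 integer ALUs (1 cycle), 1 FP multiplier
# (4 cycles), 2 memory ports (7 cycles).
example = [(2, 1), (1, 4), (2, 7)]
print(min_registers_in_flight(example))  # 2*1 + 1*4 + 2*7 = 20
```

Add functional units or lengthen a latency and the bound grows, no
matter what the instruction set looks like.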

>>WILL THEY <CISC> CATCH UP?
>>	No:
>>	Intellectual complexity.
>>	Longer design cycles.
>>	Less registers than match current global optimizers.
>Vector machines always run faster on vector problems than non vector
>machines.  Even if the cycle time is a little slower.

"Always" is a tad strong.  If you're talking about 100% vectorizable
code I suppose you're right, but there isn't much of that around.
It certainly doesn't constitute the workloads of the customers
and benchmarks that we routinely run across.  For anything less
I believe vector machines are yesterday's answer to the problem.

>The shoe is moving to the other foot, so to speak; in order to match
>vector machines, RISCs will need to go to super scalar execution
>(assuming they don't add the large register sets or the instructions
>to do vectors).  To do this they need to deal with variable length
>instructions (variability determined by register dependencies and
>stuff in the pipe, not to mention the surrounding instructions),
>register and opcode fields in variable places in the instruction word,
>complexity handling exceptions, and all the other CISC characteristics 
>RISCers love to bash.

We solved this in Multiflow's machines without resorting to any of that.
Number of registers and memory bandwidth scale with the number of
functional units.  Instruction variability is at the packet (32-bit 
instruction word) level; a packet is present or it is not, and the 
cache miss hardware looks at a "mask" word to decide.  This allows us 
to do cache refill at full memory bandwidth without the refill engine 
having to even see any of the packets -- they just get blasted into 
icache directly.  And since they're fully decoded already, we get the 
RISC benefit of simple, fast instruction decode.
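
The mask idea can be sketched in a few lines of Python.  This is an
illustration only, not the actual TRACE format: the 8-slot width, the
bit ordering (bit i covers slot i), the all-zeros no-op encoding, and
the function name are all assumptions.

```python
NOP = 0x00000000  # assumed encoding for an empty packet slot


def expand_instruction(mask, packets, slots=8):
    """Expand a compressed instruction into a fixed-width word.

    mask:    integer; bit i set means slot i has a packet in memory.
    packets: the 32-bit packets actually stored, in slot order.
    Returns a list of `slots` packets with NOP in the empty slots.
    """
    if mask >> slots:
        raise ValueError("mask has bits set beyond the slot count")
    if bin(mask).count("1") != len(packets):
        raise ValueError("mask does not match number of packets")
    stream = iter(packets)
    # Only the mask is inspected; packet contents are copied through
    # without being examined.
    return [next(stream) if mask & (1 << slot) else NOP
            for slot in range(slots)]


word = expand_instruction(0b00000101, [0xAAAA0001, 0xBBBB0002])
print([hex(p) for p in word])
```

The point is that placement is decided from the mask alone; nothing
in the loop ever decodes a packet.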

Bob Colwell               ..!uunet!mfci!colwell
Multiflow Computer     or colwell@multiflow.com
31 Business Park Dr.
Branford, CT 06405     203-488-6090