Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!apple!oliveb!bbn!bbn.com!slackey
From: slackey@bbn.com (Stan Lackey)
Newsgroups: comp.arch
Subject: Re: More RISC vs. CISC wars
Message-ID: <42621@bbn.COM>
Date: 12 Jul 89 14:55:13 GMT
References: <42550@bbn.COM> <13982@lanl.gov>
Sender: news@bbn.COM
Reply-To: slackey@BBN.COM (Stan Lackey)
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 45

In article <13982@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>From article <42550@bbn.COM>, by slackey@bbn.com (Stan Lackey):
>< [...]
><>I don't
><>know of any CISC machines with 'hardwired' instruction sets. Micro-
><>coding slows the machine down.
><
>< This is an interesting statement. As I recall hearing, Cray started
>< this perception back in the 70's. I thought it had been proven wrong.
>
>And, how many microcycles does 'one cycle' on the Alliant correspond
>to?

One. The reason many operations, even memory-to-register ones, take one
microcycle is that the machine has a scalar pipeline - even though
pipelines supposedly "can't be done" on CISCs. The cycle time is fairly
long, 170ns, but that was typical for when the machine was designed, in
1983. The cycle time was set by cache/memory/bus tradeoffs and by the
register read-modify-write time you could get with CMOS gate arrays of
that era. It had nothing to do with instruction decode, which is done in
parallel with other operations in the first and second pipeline stages;
microcode access is likewise done in parallel with the normal address
calculation. Note that CMOS has gotten something like 3 times faster
since then.

>compilers for CISCs don't use all those extra
>instructions anyway. Seems like a good idea to get rid of them and
>speed up the machine!

The Alliant compiler really does use the memory-to-register operations,
auto-inc/dec addressing modes, vector instructions, and concurrency
instructions - all to advantage.
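The latency-vs-throughput point can be sketched with a toy model (a few
lines of Python; the ideal one-op-per-cycle issue rate and the
`total_cycles` helper are illustrative assumptions, not a description of
the actual Alliant microarchitecture):

```python
# Toy model: a pipeline with `depth` stages that can accept one new
# operation every cycle. A single op pays the full pipeline latency,
# but a long stream of ops costs effectively one cycle each.

def total_cycles(n_ops, depth):
    """Cycles to push n_ops through an ideal `depth`-stage pipeline."""
    if n_ops == 0:
        return 0
    # First op takes `depth` cycles; each later op finishes one cycle
    # after its predecessor.
    return depth + (n_ops - 1)

print(total_cycles(1, 6))        # one op pays the full 6-cycle latency
print(total_cycles(1000, 6))     # 1005 cycles for 1000 ops
print(total_cycles(1000, 6) / 1000)  # ~1 cycle per op, amortized
```

So a six-cycle memory-to-register operation still sustains one result
per cycle once the pipeline is full, which is the sense in which it
"takes one cycle."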
>Alliant is obviously fairly slow, since it can do something to an
>arbitrary memory location in one cycle. The cycle time is apparently
>longer than the memory delay time.

As I hope I clarified above, the pipeline allows a very long sequence of
operations, including a memory access, to consume effectively one cycle
of execution time. Specifically, a memory-to-register floating-point
operation takes six cycles from front to back, but with the pipeline it
really consumes only one cycle. :-)

Stan