Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!mp.cs.niu.edu!bennett
From: bennett@mp.cs.niu.edu (Scott Bennett)
Newsgroups: comp.sys.next
Subject: Re: RISC vs. CISC -- SPECmarks
Message-ID: <1991Apr25.025800.4377@mp.cs.niu.edu>
Date: 25 Apr 91 02:58:00 GMT
References: <1991Apr15.165540.14270@agate.berkeley.edu> <1991Apr22.044553.16805@mp.cs.niu.edu> <1991Apr24.170804.25670@kithrup.COM>
Organization: Northern Illinois University
Lines: 57

In article <1991Apr24.170804.25670@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>In article <1991Apr22.044553.16805@mp.cs.niu.edu> bennett@mp.cs.niu.edu (Scott Bennett) writes:
>>     Case in point.  By executing multiple complex instructions per cycle,
>>the CISC would appear to benefit at least as much as the RISC, not as you
>>state.
>
>Nope, not really. That's the problem:  with a "CISC" instruction set, it's
>really very difficult to go superscalar, at least compared to some "RISC"
>instruction sets.  Why?  Because not enough registers, too many memory
>references in a single instruction, or other small, niggling details.

     If you disallow pipelining in the CISC machine, then so-called
superscalar operation is most likely impossible.  However, most CISC
machines now are not only pipelined, they are *multiply* pipelined.
Since a superscalar RISC can only be that way by pipelining, let's at
least compare only pipelined architectures.  FWIW, the MC68040
supposedly averages about 1.3 clock cycles per instruction because of
the pipelining used.  That obviously doesn't reach "superscalar" (which
would mean averaging under one cycle per instruction), but it isn't
terribly far off, either.
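     As a rough sketch of that arithmetic: instructions per cycle is
just the reciprocal of cycles per instruction.  The 1.3 is the figure
quoted above (itself only a "supposedly"); the function name is mine,
purely for illustration.

```python
def instructions_per_cycle(cycles_per_instruction):
    """Average instructions completed per clock (the reciprocal of CPI)."""
    return 1.0 / cycles_per_instruction

# 1.3 is the average CPI quoted above for the MC68040 (an unverified figure).
mc68040_ipc = instructions_per_cycle(1.3)
print(round(mc68040_ipc, 2))   # about 0.77, i.e. still under 1.0
```

     A machine would have to average better than 1.0 here before the
"superscalar" label meant anything in throughput terms.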
     In any case, what really matters is how much work gets done per
clock cycle, not how many instructions get done per cycle.  One example
is moving a block of data from one memory location to another.  A typical
RISC must 1) initialize a loop (one or more instruction fetch/decodes)
and in the body of the loop must 2) load a word into a register (one
fetch/decode), 3) store from the register into the new location (one
fetch/decode), 4) increment both addresses (probably two fetch/decodes),
and 5) loop back to repeat until finished (at least one fetch/decode).
Some CISCs have something like a "repeat" instruction that will execute
another instruction (e.g. a storage-to-storage move) a given number of
times while incrementing the addresses in that instruction, so the whole
operation may require as few as two fetch/decodes.  Other CISCs have
single instructions capable of doing block moves, so they need only one
fetch/decode.  That means more of the cycles required get spent doing the
actual work than would be the case with a RISC.  A CISC operating in such
a way would be at the *opposite* end of the spectrum from "superscalar",
but would get its work done more quickly anyway.
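     To put numbers on the fetch/decode counts above, here is a
back-of-the-envelope sketch.  It assumes the minimum counts given in the
text (one setup instruction, five per word in the RISC loop), and it
counts fetch/decodes only, not clock cycles or memory traffic.  The
function names and the 256-word block size are made up for illustration.

```python
def risc_fetch_decodes(words, setup=1, per_word=5):
    # per_word = load (1) + store (1) + two address increments (2)
    #            + loop-back branch (1), the minimums from the text
    return setup + per_word * words

def cisc_repeat_fetch_decodes(words):
    # one "repeat" instruction plus the storage-to-storage move it repeats;
    # the count does not depend on the block size
    return 2

def cisc_block_move_fetch_decodes(words):
    # a single block-move instruction, again independent of block size
    return 1

words = 256  # hypothetical block size, in words
print(risc_fetch_decodes(words))             # 1 + 5 * 256 = 1281
print(cisc_repeat_fetch_decodes(words))      # 2
print(cisc_block_move_fetch_decodes(words))  # 1
```

     The RISC count grows with the size of the block while the two CISC
counts stay fixed, which is the whole point of the comparison.  How many
cycles each of those instructions actually takes is a separate question
that this sketch does not try to answer.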
>
>-- 
>Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
>sef@kithrup.COM  |  I had a bellyache at the time."
>-----------------+           -- The Turtle (Stephen King, _It_)
>Any opinions expressed are my own, and generally unpopular with others.


                                  Scott Bennett, Comm. ASMELG, CFIAG
                                  Systems Programming
                                  Northern Illinois University
                                  DeKalb, Illinois 60115
**********************************************************************
* Internet:       bennett@cs.niu.edu                                 *
* BITNET:         A01SJB1@NIU                                        *
*--------------------------------------------------------------------*
*  "Spent a little time on the mountain, Spent a little time on the  *
*   Hill, The things that went down you don't understand, But I      *
*   think in time you will."  Oakland, 19 Feb. 1991, first time      *
*  since 25 Sept. 1970!!!  Yippee!!!!  Wondering what's NeXT... :-)  *
**********************************************************************