Path: utzoo!attcan!uunet!cbmvax!daveh
From: daveh@cbmvax.commodore.com (Dave Haynie)
Newsgroups: comp.sys.amiga.hardware
Subject: Re: RISC Amiga WHat's RISC?
Message-ID: <15798@cbmvax.commodore.com>
Date: 12 Nov 90 16:16:32 GMT
References: <1178@iceman.jcu.oz> <w29yR3w163w@valnet> <39367@ut-emx.uucp> <15759@cbmvax.commodore.com> <yorkw.658182081@pasture.ecn.purdue.edu>
Reply-To: daveh@cbmvax.commodore.com (Dave Haynie)
Organization: Commodore, West Chester, PA
Lines: 78

In article <yorkw.658182081@pasture.ecn.purdue.edu> yorkw@pasture.ecn.purdue.edu (Willis F York) writes:
>Well Here's a Novice Question.

>I know RISC means Reduced INstruction Set Computer (Or Close)

>But what does this mean?
>The CPU chip has Fewer Commands (oops Instructions) that it knows how 
>to run? What's the big deal about that? OR an it to those few REAL fast?

Lots of folks argue about what RISC is, mainly those with nothing better to
do.  RISC really isn't any one thing, it's kind of a bannerhead for a set of
tools you might term "the late 80s approach to microprocessor design".  This
toolkit takes into account many ideas, but the main point seems to be that
one should be processing at least one instruction every clock cycle.  To this
end, you have:

	- Simplified Instruction Set
	  This doesn't always mean that the instructions do less work than
	  those of a CISC machine.  For example, most RISC chips have
	  three-operand arithmetic functions, versus the two-operand equivalents
	  in chips like the 68030.  The basic idea is that the instruction set
	  should be very orthogonal, stick mainly to simple instructions that
	  can be executed in a single cycle, and not support zillions of
	  different addressing modes.  Once you have such an instruction set,
	  the computer design is simpler and the set can be implemented fully
	  hardwired, rather than via microcode as in the 68030.  
	- Load/Store Architecture
	  Another RISC tenet is that touching main memory is the only 
	  operation likely to take more than one clock cycle.  So you isolate
	  that operation -- only load and store instructions are capable of
	  touching memory, all others work between registers.  Along with this,
	  you'll find that RISC chips tend to have more registers, from 32
	  up to a hundred or more.
	- Pipelining
	  In order to keep roughly one instruction per clock running, RISC
	  designs tend to be heavily pipelined.  You have several stages of
	  execution for each instruction.  So, while a single instruction
	  may actually take 6 clock cycles to pass through the whole machine
	  pipeline, there should be one instruction in each of these 6 slots
	  at all times, therefore yielding an effective 1 clock/instruction.
	  Any instruction that takes longer than this, such as a load or store, will
	  cause a pipeline stall, an unused slot or two in the pipeline.  The
	  long pipeline causes some strange coding practices.  On many RISC
	  chips, even accessing the same register in consecutive instructions
	  will cause a pipeline stall, since the register hasn't quite been
	  written in I0 by the time it needs to be read by I1.  So smart 
	  compilers are called for, which can manage register allocations.
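
The pipeline point in the list above can be sketched as a toy cycle counter.
This is a minimal illustration in Python, not a model of any real chip: the
6-stage depth comes from the example above, while the one-cycle stall penalty,
the register names, the two sample programs and the count_cycles helper are
all assumptions made up for the sketch.  It only charges a stall when an
instruction reads a register written by the instruction right before it.

```python
from typing import List, Optional, Tuple

# (assembly text, destination register or None, source registers)
Instr = Tuple[str, Optional[str], Tuple[str, ...]]

PIPELINE_DEPTH = 6  # stages, as in the 6-slot example in the text
RAW_STALL = 1       # ASSUMED penalty (cycles) for a back-to-back dependency


def count_cycles(program: List[Instr]) -> int:
    """Cycles needed to push `program` through the toy pipeline."""
    if not program:
        return 0
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_dest = prev[1]
        # Stall if this instruction reads what the previous one just wrote.
        if prev_dest is not None and prev_dest in cur[2]:
            stalls += RAW_STALL
    # The first instruction takes PIPELINE_DEPTH cycles to come out the far
    # end; each later one finishes a cycle after its predecessor, plus stalls.
    return PIPELINE_DEPTH + (len(program) - 1) + stalls


# a = b + c; d = e + f, written load/store style with three-operand adds.
NAIVE: List[Instr] = [
    ("load  r1, b",      "r1", ()),
    ("load  r2, c",      "r2", ()),
    ("add   r3, r1, r2", "r3", ("r1", "r2")),  # needs r2 right away
    ("store a, r3",      None, ("r3",)),       # needs r3 right away
    ("load  r4, e",      "r4", ()),
    ("load  r5, f",      "r5", ()),
    ("add   r6, r4, r5", "r6", ("r4", "r5")),  # needs r5 right away
    ("store d, r6",      None, ("r6",)),       # needs r6 right away
]

# Same work, reordered the way a scheduling compiler might do it.
SCHEDULED: List[Instr] = [
    ("load  r1, b",      "r1", ()),
    ("load  r2, c",      "r2", ()),
    ("load  r4, e",      "r4", ()),
    ("load  r5, f",      "r5", ()),
    ("add   r3, r1, r2", "r3", ("r1", "r2")),
    ("add   r6, r4, r5", "r6", ("r4", "r5")),
    ("store a, r3",      None, ("r3",)),
    ("store d, r6",      None, ("r6",)),
]

if __name__ == "__main__":
    print("naive:    ", count_cycles(NAIVE), "cycles")
    print("scheduled:", count_cycles(SCHEDULED), "cycles")
```

Under these assumptions the naive ordering costs 17 cycles and the reordered
one 13, for the same eight instructions.  Real pipelines have other hazards
(and often forwarding paths that hide some of these), so treat the numbers as
a picture of the idea and nothing more.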

There are a few more concepts, but these are the basic ones.  Most of the ideas
that are in today's RISC devices come, either directly or a little roundabout,
from the supercomputer work done by folks like Cray.  And many of the same
reasons are present, only at the chip level.  Optimizing the size of the CPU
design means many fewer gates.  So you can use a faster process technology,
if available, than the folks building CISCs.  Or you can add much more on-chip
cache in the same process technology.  Or you can make the thing yield like
crazy and drop the price relative to a CISC device.  

There really isn't anything in RISC you can't apply to CISC, for the right 
price.  The 68040 is a good example.  The most common 680x0 instructions in
that device are hard wired rather than microcoded.  It has a deep pipeline,
with even a few innovations over most RISCs (for example, address register
increments and decrements, as well as offsets, get resolved in their own
pipeline stage with their own ALU, so these addressing modes don't add time
to the instruction execution).  The 68040 also has a large on-chip cache,
which unlike the caches in most other chips, responds in a single cycle for
cache hits, making it nearly as fast as register access.  On the downside, all
this extra logic has made the 68040 take up 1.2 million devices in a 0.8
micron CMOS process.  You aren't going to see this move into GaAs or ECL any
time soon, whereas the MIPS folks already have an ECL version of their MIPS
architecture, the R6000.  And while 68040s will most likely cost several
hundred dollars for some time, you can get RISC chips at or near the same
performance level for $100, maybe even a little less.


-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	Standing on the shoulders of giants leaves me cold	-REM