Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!iuvax!bobmon
From: bobmon@iuvax.cs.indiana.edu (RAMontante)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: RISC vs CISC
Message-ID: <30203@iuvax.cs.indiana.edu>
Date: 22 Nov 89 16:35:49 GMT
Reply-To: bobmon@iuvax.cs.indiana.edu (RAMontante)
Distribution: comp.binaries.ibm.pc.d
Organization: malkaryotic
Lines: 34

bwwilson@lion.waterloo.edu (Bruce Wilson) <18323@watdragon.waterloo.edu> :
-
-If the RISC instructions are twice as fast but require twice as
-many, what is gained?

The speedup comes from (at least) three things:

Number of cycles used to perform a function.  Years ago it was noticed
on some VAXen that some operations which had a complex instruction
of their own, built for just that purpose, could nonetheless be done
faster in a subroutine that used only simple instructions (I don't
remember the details of the instruction).  The simple instructions were
faster because they needed fewer clock cycles to decode and execute.  The
number of clock cycles needed to set up the complex instruction
outweighed the number used in subroutine overhead....  Since the
specialized instructions aren't commonly used anyway, you can make a
strong case for keeping them out of the cpu (which simplifies its design
enormously) and putting lots of sweat into doing them right in a
coprocessor if they're a really good idea (e.g.  trig functions).
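
As a rough sketch of that cycle-counting argument (the cycle counts
below are invented for illustration, not taken from any VAX, and the
function names are my own), compare one complex instruction against a
subroutine built from simple ones:

```python
def complex_instruction_cycles(setup, execute):
    """Total cycles for one complex instruction: setup plus execution."""
    return setup + execute


def subroutine_cycles(call_overhead, simple_count, cycles_each):
    """Total cycles for a subroutine made of simple instructions."""
    return call_overhead + simple_count * cycles_each


# Hypothetical figures, chosen only to show the shape of the comparison.
complex_cost = complex_instruction_cycles(setup=40, execute=60)
subroutine_cost = subroutine_cycles(call_overhead=10, simple_count=30,
                                    cycles_each=2)

print("complex instruction:", complex_cost, "cycles")
print("subroutine:         ", subroutine_cost, "cycles")
```

With these made-up numbers the subroutine wins even though it executes
thirty instructions instead of one, because each of them is cheap and
the call overhead is small next to the complex instruction's setup.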

Clock rate.  The simpler cpu design allows circuits to be packed into a
smaller area.  This means that signals don't have to travel as far ---
one of the approaching hard limits on cpu speed is the "speed of light",
i.e. signal propagation in a wire.  Also, simpler instructions can be
decoded and executed using fewer circuit elements (gates).  Since each
gate adds a gate delay, and the clock period must be long enough to
span the longest chain of gate delays, shortening that chain allows a
faster clock.
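
The gate-delay point can be sketched the same way.  The path names,
gate counts and the 2 ns gate delay here are assumptions made up for
the example; the only thing carried over from the argument is that the
clock period is set by the longest chain of gates:

```python
def clock_period_ns(paths, gate_delay_ns):
    """Clock period = gates on the longest path times the delay per gate.

    paths maps a (hypothetical) path name to its number of gates in series.
    """
    return max(paths.values()) * gate_delay_ns


# Invented gate counts: the complex design has a long decode path.
complex_paths = {"decode": 24, "alu": 16, "writeback": 8}
simple_paths = {"decode": 10, "alu": 16, "writeback": 8}

complex_period = clock_period_ns(complex_paths, gate_delay_ns=2)
simple_period = clock_period_ns(simple_paths, gate_delay_ns=2)

print("complex design:", complex_period, "ns per clock")
print("simple design: ", simple_period, "ns per clock")
```

Note that simplifying the decoder only helps until some other path
(here the ALU) becomes the longest one.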

Cache, pipeline, etc.  Since the cpu design is *smaller*, there is more
real estate available on the chip for laying out cache memory,
pipelines, and so forth, all of which speed things up.  Another choice
made by one design is to put in *lots* of registers; when the processor
context-switches, it doesn't need to save the register state on a stack
in memory, it just moves on to a different set of registers on the chip.
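
To tie this back to the original question, here is a back-of-envelope
sketch.  Run time is (number of instructions) x (average cycles per
instruction) x (clock period); cache and pipelining show up as a lower
average cycles per instruction.  All the figures are hypothetical, not
measurements of any real machine:

```python
def run_time_ns(instructions, cycles_per_instruction, clock_period_ns):
    """Run time = instruction count x average cycles each x clock period."""
    return instructions * cycles_per_instruction * clock_period_ns


# Hypothetical CISC machine: fewer instructions, more cycles each,
# slower clock.
cisc_time = run_time_ns(instructions=1_000_000,
                        cycles_per_instruction=6,
                        clock_period_ns=100)

# Hypothetical RISC machine: twice as many instructions, but fewer
# cycles each and a faster clock.
risc_time = run_time_ns(instructions=2_000_000,
                        cycles_per_instruction=1.5,
                        clock_period_ns=50)

speedup = cisc_time / risc_time
print("speedup:", speedup)
```

So "twice as many instructions" is only one of the three terms; the
gain comes from the other two shrinking by more than the first grows.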