Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!husc6!rice!sun-spots-request
From: brent%sprite.Berkeley.EDU@ginger.berkeley.edu (Brent Welch)
Newsgroups: comp.sys.sun
Subject: RISC versus CISC
Message-ID: <8902081650.AA270636@sprite.Berkeley.EDU>
Date: 14 Feb 89 06:24:07 GMT
Sender: usenet@rice.edu
Organization: Sun-Spots
Lines: 43
Approved: Sun-Spots@rice.edu
Original-Date: Wed, 8 Feb 89 08:50:54 PST
X-Sun-Spots-Digest: Volume 7, Issue 152, message 2 of 9

The standard RISC argument is that you spend less chip area on complicated
instructions (i.e. microcode) so you can make the cycle time shorter, and
you can spend chip area on things like register windows so that procedure
calls go faster.

Register windows are a set of overlapping registers, say 64 registers
total divided into 7 or 8 overlapping sets of 16.  At any one time only 16
registers are visible.  At procedure call you bump the "window pointer" by
8 to make another partially overlapping set of registers visible.  This
means you can use the overlapping part to pass procedure arguments very
fast;  what were the caller's local registers are the callee's input
registers after the window is shifted.  If you call very deep the system
has to trap and spill windows, or do the converse when you unwind.
Ordinarily, however,  you can save many main-memory references with this
technique.
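
To make the window arithmetic concrete, here is a small Python sketch of
the scheme just described.  It is only a toy model built on the example
figures above (64 registers, 16 visible, pointer bumped by 8), not a
description of how SPARC or any real chip implements it.  The class name,
the method names, and the choice to treat the register file as a ring are
all my own assumptions for illustration.

```python
NUM_PHYSICAL = 64   # total registers (example figure from the text)
WINDOW_SIZE = 16    # registers visible at any one time
WINDOW_STEP = 8     # amount the window pointer moves per call
# Treated as a ring, 64 registers hold at most 7 windows at once,
# because adjacent windows share 8 registers.
MAX_RESIDENT = NUM_PHYSICAL // WINDOW_STEP - 1


class RegisterWindows:
    """Toy model of overlapping register windows (hypothetical names)."""

    def __init__(self):
        self.physical = [0] * NUM_PHYSICAL
        self.wp = 0          # window pointer: physical index of visible r0
        self.resident = 1    # windows currently held in registers
        self.memory = []     # stand-in for the stack used by the trap handler
        self.spills = 0
        self.fills = 0

    def _index(self, r):
        if not 0 <= r < WINDOW_SIZE:
            raise IndexError("only r0..r15 are visible")
        return (self.wp + r) % NUM_PHYSICAL

    def read(self, r):
        return self.physical[self._index(r)]

    def write(self, r, value):
        self.physical[self._index(r)] = value

    def call(self):
        """Procedure call: bump the pointer, spilling first if the ring is full."""
        if self.resident == MAX_RESIDENT:
            # Overflow trap: save the 8 registers that only the oldest
            # window can see, since the new window is about to reuse them.
            oldest = (self.wp - (self.resident - 1) * WINDOW_STEP) % NUM_PHYSICAL
            block = [self.physical[(oldest + i) % NUM_PHYSICAL]
                     for i in range(WINDOW_STEP)]
            self.memory.append(block)
            self.spills += 1
            self.resident -= 1
        self.wp = (self.wp + WINDOW_STEP) % NUM_PHYSICAL
        self.resident += 1

    def ret(self):
        """Procedure return: move the pointer back, refilling if needed."""
        if self.resident == 1:
            if not self.memory:
                raise RuntimeError("return with no caller window")
            # Underflow trap: restore the caller's 8 non-shared registers.
            block = self.memory.pop()
            target = (self.wp - WINDOW_STEP) % NUM_PHYSICAL
            for i, value in enumerate(block):
                self.physical[(target + i) % NUM_PHYSICAL] = value
            self.fills += 1
            self.resident += 1
        self.wp = (self.wp - WINDOW_STEP) % NUM_PHYSICAL
        self.resident -= 1


regs = RegisterWindows()
regs.write(8, 42)     # caller puts an argument in its upper half
regs.call()
arg = regs.read(0)    # callee finds it in r0 with no memory traffic
regs.ret()
```

In this model a value the caller writes to r8 through r15 shows up in the
callee's r0 through r7 without being copied.  Memory is touched only when
the call chain gets deeper than the seven windows the ring can hold.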

Also, you can make instruction decoding go faster because there are fewer
different instructions, and fewer addressing modes.  Typically all ALU
operations are register-to-register, and memory references are through
explicit load/store instructions.  You also do pipelining and enforce some
constraints so that you get nearly one instruction completed per cycle,
even though it takes say 4 cycles in the pipeline for a single
instruction.  (Compare this with the cycle counts given in a 680x0 manual:
2-10 cycles per simple instruction.) The constraints on instruction sequences
include "delayed branches" where the instruction after a branch always
gets executed.  This delay slot can be filled by clever compilers.
Sometimes there are also restrictions on using the result of one
instruction as the input to the next instruction because of the pipeline.
I think SPARC uses internal forwarding so this restriction doesn't apply.
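
The "nearly one instruction per cycle" claim is just arithmetic, and a
few lines of Python show it.  This is an idealized sketch: the 4-stage
depth is the "say 4 cycles" figure from above, the function names are
made up, and stalls (cache misses, unfilled delay slots, and so on) are
lumped into a single assumed parameter.

```python
def unpipelined_cycles(n_instructions, stages=4):
    """Each instruction runs start to finish before the next begins."""
    return n_instructions * stages


def pipelined_cycles(n_instructions, stages=4, stall_cycles=0):
    """The first instruction takes `stages` cycles to come out the end;
    after that one instruction completes per cycle, plus any stalls."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1) + stall_cycles


def cycles_per_instruction(n_instructions, stages=4, stall_cycles=0):
    """Average cycles per instruction over the whole run."""
    return pipelined_cycles(n_instructions, stages, stall_cycles) / n_instructions


# 1000 instructions: 4000 cycles unpipelined, 1003 cycles pipelined.
print(unpipelined_cycles(1000), pipelined_cycles(1000))
print(cycles_per_instruction(1000))
```

The pipeline fill cost is paid once, so over a long run the average
approaches one cycle per instruction.  Every stall cycle adds directly
to the total, which is why the compiler works to fill delay slots.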

Perhaps a final argument is that because the chip is less complicated you
can put it onto newer, faster technology more easily.  The first SPARC
chips were made on gate-arrays, for example, and a gallium arsenide chip
has been promised.

Ultimately, however, I'm not sure there has been a truly fair
head-to-head competition between a RISC and a CISC.  You'd have to use the
same technology, the same processor cache, the same memory bandwidth, the
same applications, and the best compilers for both.

	Brent Welch
	University of California, Berkeley
	brent%sprite@ginger.Berkeley.EDU