Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!cs.utexas.edu!usc!apple!baum
From: baum@Apple.COM (Allen J. Baum)
Newsgroups: comp.arch
Subject: Re: Why The Move To RISC Architectures?  ('386 vs. RISC)
Message-ID: <39746@apple.Apple.COM>
Date: 22 Mar 90 23:38:49 GMT
References: <28012@cup.portal.com> <289@emdeng.Dayton.NCR.COM>
Reply-To: baum@apple.UUCP (Allen Baum)
Organization: Apple Computer, Inc.
Lines: 51

[]
>In article <289@emdeng.Dayton.NCR.COM> hrich@emdeng.UUCP (George.H.Harry.Rich) writes:
>.In article <28012@cup.portal.com> Will@cup.portal.com (Will E Estes) writes:
>.>architectures really tell you anything of worth?
>...
>.>
>.>Finally, why is everyone so excited about RISC?  Why the move to
>.>simplicity in microprocessor instruction sets?  You would think
>.>that the trend would be just the opposite - in order to increase
>.> the speed of very high-level instructions by putting them in silicon
>

Actually, the problem with complex stuff is that it isn't used, so why put
it in? The higher the semantic content, the less often it is used. RISC
attempts to put in the highest semantic content that gets used a lot, which
isn't very high, it turns out.

>First of all, what you save on a complex instruction versus several simple
>ones is the fetch and decode time.  If the processor has good prefetch and
>caching what you are generally talking about is decode time.  However,
>a really simple instruction set takes less time to decode,

Yes, but if your critical paths are not decode related, then it just doesn't
matter. What counts is reducing critical paths, both in hardware (where they
are generally load/store or branch related) and in software (where the measure
is '# of insts. to perform some function').
CISCs attempt to reduce the second (software) factor. Unfortunately, they
often do this by increasing the first, and they can't do it often enough
to make up for this.
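
A rough way to see the trade-off is the usual product of instruction count,
cycles per instruction, and cycle time. The sketch below is only an
illustration: the function name and every number in it are made up, and none
of them describe a real machine.

```python
# All numbers are hypothetical and chosen only to illustrate the trade-off.

def run_time_ns(inst_count, cycles_per_inst, cycle_time_ns):
    """Execution time = instructions x cycles per instruction x cycle time."""
    return inst_count * cycles_per_inst * cycle_time_ns

# Hypothetical simple-instruction machine: more instructions, short cycle.
simple_ns = run_time_ns(inst_count=1_300_000, cycles_per_inst=1.2,
                        cycle_time_ns=40)

# Hypothetical complex-instruction machine: fewer instructions (the software
# factor improves), but a longer hardware critical path stretches the cycle
# and the complex instructions need more cycles each.
complex_ns = run_time_ns(inst_count=1_000_000, cycles_per_inst=2.5,
                         cycle_time_ns=50)

print("simple: ", simple_ns, "ns")
print("complex:", complex_ns, "ns")
```

With these made-up figures the shorter program still loses, because the
reduction in instruction count is smaller than the combined growth in cycles
per instruction and cycle time.
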
You can make instructions that perform the same actions as a series of simpler
instructions. I can make n^i variations of the latter, and few variations of
the former. Experience has shown that lots of variations get used, especially
after optimization, so that it is impossible to pick a small set of complex
insts. that get used enough to make them worthwhile. Besides, these complex
insts. often get executed as a series of microsteps, and often go no faster
than the series of simple instructions. Finally, it is possible to re-arrange
the order of the simpler ones to avoid interlocks, something that can't be
done to the steps inside a complex instruction.
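
The re-arrangement point can be sketched with a toy model. Everything here is
an assumption for illustration: a single-issue, in-order machine, a load whose
result is usable two cycles after issue, and the register names r1..r4.

```python
def cycles(program, load_latency=2):
    """Count issue cycles on a toy in-order, single-issue machine.

    Each instruction is (op, dest, sources). A load's result is assumed to be
    usable load_latency cycles after issue; any other result after 1 cycle.
    """
    ready = {}   # register -> first cycle in which its value can be used
    cycle = 0    # next cycle in which an instruction may issue
    for op, dest, sources in program:
        # Interlock: the instruction waits until all its sources are ready.
        issue = max([cycle] + [ready.get(reg, 0) for reg in sources])
        latency = load_latency if op == "load" else 1
        ready[dest] = issue + latency
        cycle = issue + 1
    return cycle

# Each add immediately follows the load it depends on, so it stalls.
naive = [
    ("load", "r1", []),
    ("add",  "r3", ["r3", "r1"]),
    ("load", "r2", []),
    ("add",  "r4", ["r4", "r2"]),
]

# Same work, with the second load moved up to fill the first load's delay.
reordered = [
    ("load", "r1", []),
    ("load", "r2", []),
    ("add",  "r3", ["r3", "r1"]),
    ("add",  "r4", ["r4", "r2"]),
]

print(cycles(naive), cycles(reordered))
```

In this model the naive order takes 6 cycles and the re-ordered one takes 4.
A compiler can do this to separate loads and adds; it has no way to reach
inside a single load-and-add instruction and do the same.
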

On the flip side, complex instructions can run a deeper pipeline. If the
instructions can truly be piped (a very big if, when interlocks are taken
into account), then this is equivalent to a cheap 'superscalar' implementation.

For example, a series of "Add Mem to Reg" instructions, which can be piped at
one per cycle, will run twice as fast as the simpler "Load Mem to Reg", "Add
Reg to Reg" series. The pipeline is more complex, but is simpler than the
full superscalar implementation. The question is: with good register
allocation, does it happen often enough to make it worthwhile?
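
A back-of-the-envelope sketch of that question, under loud assumptions: both
machines issue one instruction per cycle, pipeline fill and interlocks are
ignored, and the counts are invented. The function names are hypothetical.

```python
def simple_cycles(n_adds, n_from_memory):
    """Load/store machine: one add per operation, plus one load for every
    operand that the register allocator could not keep in a register."""
    return n_adds + n_from_memory

def piped_complex_cycles(n_adds, n_from_memory):
    """Machine with a fully piped "Add Mem to Reg": one cycle per add,
    whether the operand comes from memory or from a register."""
    return n_adds

# Every operand comes from memory: the 2x case from the example above.
print(simple_cycles(100, 100), piped_complex_cycles(100, 100))

# Good register allocation: only 10 of 100 operands still need a load.
print(simple_cycles(100, 10), piped_complex_cycles(100, 10))
```

Under these assumptions the advantage falls from 2x to 1.1x as the register
allocator removes loads, which is the reason the question is open.
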
--
		  baum@apple.com		(408)974-3385
{decwrl,hplabs}!amdahl!apple!baum