Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!amdahl!nsc!voder!apple!bcase
From: bcase@Apple.COM (Brian Case)
Newsgroups: comp.arch
Subject: Re: 16 & 32 bit vs 32 bit only instructions for RISC.
Message-ID: <7538@apple.Apple.Com>
Date: 2 Mar 88 20:06:26 GMT
References: <2574@im4u.UUCP> <9740@steinmetz.steinmetz.UUCP>
Reply-To: bcase@apple.UUCP (Brian Case)
Organization: Ungermann-Bass Enterprises
Lines: 55

In article <9740@steinmetz.steinmetz.UUCP> sungoddess!oconnor@steinmetz.UUCP writes:
>There is NO intrinsic reason 16-bit instructions would decode slower than
>32-bit instructions. In fact, they can ultimately decode FASTER :
>the fewer bits your decoder has to look at, the faster it can be.
>Barring other complications of course.

The first statement is true.  However, 32-bit instructions typically have
more bits dedicated to the opcode than 16-bit instructions.  This allows
32-bit instructions to have less-dense encodings and therefore faster
decodings.  For a real lesson in this aspect, see the Stanford MIPS-X and
original Berkeley RISC instruction encodings.  Yeow!  The instruction decode
logic is almost impossible to see on the MIPS-X.
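
To put a number on "less-dense", here is a minimal sketch.  The function
name and the figure of 60 operations are my own illustrative assumptions,
not taken from any of the machines mentioned; the point is only that the
same set of operations fills far less of a wider opcode field, which
leaves the designer free to pick bit patterns that are easy to decode.

```python
def encoding_density(num_operations: int, opcode_bits: int) -> float:
    """Fraction of the available opcode patterns that are actually assigned."""
    slots = 2 ** opcode_bits
    if num_operations > slots:
        raise ValueError("more operations than opcode patterns")
    return num_operations / slots


# Hypothetical: the same 60 operations in a 6-bit and an 8-bit opcode field.
dense = encoding_density(60, 6)    # 60 of 64 patterns used
sparse = encoding_density(60, 8)   # 60 of 256 patterns used
print(f"6-bit field: {dense:.1%} used, 8-bit field: {sparse:.1%} used")
```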

>Dedicating 64 pins purely to instruction fetch (assuming a Harvard
>architecture) is quite a lot of a rather scarce resource. Sure
>you wanna do this on a micro ?

Sounds like it might be a good idea.  Note that instruction-only-bus pins
are INPUT-only; thus their corresponding pads are much "faster" and simpler
than bidirectional bus pads.

>The appropriate measure of cache size, IMHO, is in INSTRUCTIONS.
>Given you have some limited number of transistors to put into
>a cache, then the smaller your instructions are, the "bigger"
>your cache will be.

This is quite true.  However, see the following.

>Also, instruction size affects several "second-order" performance
>factors, like how quickly a program loads from a "low-speed" (like
>disk) I/O device and how often you page-fault. This effect
>is of course due to the fact that programs written in a 32-bit
>RISC instruction set will be (according to our data) 65% larger
>than the same program in a 16-bit RISC instruction set.

Yes, the second-order effects (not affects) can be very important in
certain environments.  As to code size differences:  I have a very large
program (30K lines of C code) that compiles to about 300K bytes of Am29000
code.  On the VAX, it compiles to about 225K bytes (yes, that's with PCC
and -O turned on; the 29K compiler is a wonderful thing from MetaWare).
That's roughly 33% larger.  That seems typical, although code size ratios can be
anywhere from 1.1 to over 2.0.  However, differences of this size are not
terribly important until the code is more like 2 to 3 times as big.  The
question is "what is the cache miss ratio in real life?"  This is NOT
necessarily directly related to general code size!  One kind of
optimization that will be very important (and I hope commonplace) is loop
unrolling.  Yes, the code size ratios will still roughly scale, but the
point is that we are talking about, to some degree, space/time tradeoffs.
If you want a little less time, you can usually get it by giving up a little
space.  This works to a point.  Clearly, you wouldn't want your instruction
format to have one bit per register.
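
The arithmetic above can be checked with a short sketch.  The two byte
counts are the ones quoted above (taking "K" as 1000, which does not
change the ratio); the function names, the 32-register file, and the
three-operand format are assumptions of mine for illustration only.

```python
def size_increase_percent(larger: int, smaller: int) -> float:
    """How much bigger `larger` is than `smaller`, as a percentage."""
    return (larger / smaller - 1.0) * 100.0


def register_field_bits(num_registers: int, one_bit_per_register: bool) -> int:
    """Width of one register operand field under the chosen encoding."""
    if one_bit_per_register:
        return num_registers          # one bit for every register
    # Binary-encoded register number: enough bits to count 0..num_registers-1.
    return max(1, (num_registers - 1).bit_length())


# Am29000 code versus VAX code for the same program.
print(f"{size_increase_percent(300_000, 225_000):.0f}% larger")

# Hypothetical three-operand instruction on a 32-register machine.
binary_bits = 3 * register_field_bits(32, one_bit_per_register=False)
one_hot_bits = 3 * register_field_bits(32, one_bit_per_register=True)
print(f"binary fields: {binary_bits} bits, one bit per register: {one_hot_bits} bits")
```

Under those assumptions the binary-encoded operands fit easily in a
32-bit instruction, while one bit per register would need three times
the whole instruction word for the operands alone.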

>Sorry, we haven't published our data yet. It's just an analysis
>(using information-theory) of existing data anyway.

I do hope you guys publish such data.  The more information the better!