Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!ll-xn!oberon!bbn!rochester!PT.CS.CMU.EDU!andrew.cmu.edu!zs01+
From: zs01+@andrew.cmu.edu (Zalman Stern)
Newsgroups: comp.arch
Subject: Re: 16 & 32 bit vs 32 bit only instructions for RISC.
Message-ID: <kW=EuFy00Vs8QIKUcP@andrew.cmu.edu>
Date: 3 Mar 88 07:43:13 GMT
References: <9651@steinmetz.steinmetz.UUCP> <9678@steinmetz.steinmetz.UUCP>,
	<2574@im4u.UUCP>
Organization: Carnegie Mellon University
Lines: 51
In-Reply-To: <2574@im4u.UUCP>


I believe that 16 bit instructions can be a win. Good code density saves disk 
space as well as memory space. Besides, the more instructions you get in N 
bits, the more performance your cache and bus bandwidth buy you. The trick is 
to keep from designing a CISC in the process. Mainly, multiple instruction 
lengths and hairy bit encodings should be avoided like the plague.

I actually tried to design an architecture using 16 bit instruction words. 
Being a complete amateur, I ended up with a complete mess. I did learn the 
following though:

    You lose on 32 bit immediate operands. This can be a performance loss.
    You do not have room for multiple forms of load (i.e. signed/unsigned
        half word and byte).
    More than 16 registers is hard.
    Addressing modes don't fit. (As the RISC advocates say, "So what?")

The RPM40 gets around these by using prefix instructions to extend immediates, 
a separate size register to specify non-word format, and an asymmetric 
register set to give you 21 registers. (Only 16 of the registers are 
accessible from every register operand slot.) Addressing modes are unnecessary 
since you can use some of the instruction bits you save for address 
calculation instructions. Basically, slick solutions to the above problems.
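
To make the prefix idea concrete, here is a minimal sketch of how a prefix 
scheme can extend an immediate. The field widths (8 bits per prefix, 8 bits 
in the final instruction) and the function names are invented for 
illustration. They are not the RPM40's actual encoding, only chosen so that a 
full 32 bit value costs three prefixes plus the instruction itself.

```python
# Hypothetical field widths, NOT taken from any real RPM40 documentation.
PREFIX_BITS = 8   # immediate bits carried by one prefix instruction
INLINE_BITS = 8   # immediate bits carried by the final instruction


def encode_immediate(value, total_bits=32):
    """Split value into (prefix_payloads, inline_field).

    Prefix payloads are returned most significant chunk first, in the
    order the prefixes would appear in the instruction stream.
    """
    if not 0 <= value < (1 << total_bits):
        raise ValueError("immediate out of range")
    inline = value & ((1 << INLINE_BITS) - 1)
    rest = value >> INLINE_BITS
    prefixes = []
    while rest:
        prefixes.append(rest & ((1 << PREFIX_BITS) - 1))
        rest >>= PREFIX_BITS
    prefixes.reverse()
    return prefixes, inline


def decode_immediate(prefixes, inline):
    """Rebuild the immediate the way a decoder would accumulate it."""
    acc = 0
    for payload in prefixes:
        acc = (acc << PREFIX_BITS) | payload
    return (acc << INLINE_BITS) | inline
```

Under these assumed widths a small constant needs no prefix at all, so the 
common case stays at one 16 bit instruction and only large constants pay for 
the extra words.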

A good argument for 32 bit instructions is the MIPS R2000. Given a really 
tense compiler, having a full 32 entry register file may be worth the extra 
bits. Also, I recall seeing some statistics that a significant number of their 
load instructions (measured dynamically) use 16 bit offsets. (Notably to index 
off the global variable pointer.) There are also some special privileged 
instructions on the R2000 that make things like software page fault resolution 
go like hell. I don't see much room in a 16 bit instruction for stuff like 
this. (Although you could steal some bits from the coprocessor instruction on 
the RPM40.) In any event, the R2000 seems geared for large, reasonably fast 
memory subsystems. In this case, speed is worth some wasted bits. 

One thing that is really useful is to be able to take a 32 bit constant and do 
something with it in 64 bits worth of instruction stream. For example, loading 
a value from an absolute (32 bit) address. The IBM RT, the MIPS R2000, and the 
RPM40 can all do this. The RT and the R2000 use one 32 bit instruction to load 
the high half of a register and a load with offset instruction to finish the 
job. The RPM40 uses three prefix instructions and a load. As near as I can 
tell, the AMD 29000 requires three 32 bit instructions to do this. I find this 
capability useful because compilers should (almost) always generate 32 bit 
addresses. 
When we first compiled Scribe on an RT, it died during the link phase. Turned 
out that there was more than a megabyte of executable code and the compiler 
had chosen an instruction with a 20 bit absolute address... (This has long 
since been fixed.)
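
The two-instruction sequence described above can be sketched as follows. This 
is a minimal illustration, assuming an R2000-style pair where the first 
instruction loads the upper 16 bits of a register and the load then adds a 
sign-extended 16 bit offset. The function names are hypothetical. The last 
helper shows why a 20 bit absolute address field runs out at one megabyte.

```python
def sign_extend16(x):
    """Interpret a 16 bit field as a signed offset."""
    return x - 0x10000 if x & 0x8000 else x


def split_hi_lo(addr):
    """Split a 32 bit address into (hi, lo) 16 bit instruction fields.

    Because the load sign-extends lo, hi is bumped by one whenever
    bit 15 of lo is set, so that hi and lo still sum to addr.
    """
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address does not fit in 32 bits")
    lo = addr & 0xFFFF
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def effective_address(hi, lo):
    """Address the load computes: (hi << 16) + sign_extend(lo), mod 2**32."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF


def fits_absolute(addr, bits):
    """True if addr can be encoded in an unsigned absolute field of this width."""
    return 0 <= addr < (1 << bits)
```

For example, split_hi_lo(0x12348000) yields a high half of 0x1235 rather than 
0x1234, since the low half 0x8000 is treated as a negative offset. And 
fits_absolute(0x100000, 20) is false: the first byte past one megabyte is 
already out of reach of a 20 bit field.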

Sincerely,
Zalman Stern
Internet: zs01+@andrew.cmu.edu     Usenet: I'm soooo confused...
Information Technology Center, Carnegie Mellon, Pittsburgh, PA 15213-3890