Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!know!zaphod.mps.ohio-state.edu!mips!winchester!mash
From: mash@mips.COM (John Mashey)
Newsgroups: comp.arch
Subject: Re: 64 bits--why stop there?
Message-ID: <41004@mips.mips.COM>
Date: 21 Aug 90 21:18:25 GMT
References: <5539@darkstar.ucsc.edu> <13285@yunexus.YorkU.CA> <30728@super.ORG> <9660@ganymede.inmos.co.uk> <224@csinc.UUCP> <1263.26cdaecc@waikato.ac.nz> <6106@vanuata.cs.glasgow.ac.uk> <2437@crdos1.crd.ge.COM>
Sender: news@mips.COM
Reply-To: mash@mips.COM (John Mashey)
Organization: MIPS Computer Systems, Inc.
Lines: 97

In article <2437@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:

>  While we're all talking about 64 bits, where is it writ' that word
>size shall be a power of two bits? Outside of the prevalence of the
>eight bit byte, is there a good technical reason for it? Certainly the
>old Honeywells I used to use, with their lovely nine bit bytes and 36
>bit words... Conversion to IBM was *REAL* painful. 
 ....
>  Now this was going from 36 to 32 bits. If four bits can hurt that
>much, how much would 48 bit help, instead of 64? I know there are some
>48 bits machines made (or were), what gains were there?
.....
>  Maybe some of the chip designers can give an idea of what the cost
>ratio is for 48 vs 64 bits. Obviously there can be a lot of new
>applications tackled with 48 bits, is the saving over 64 significant, or
>should we expect a leap to 64? 48 bits soon is more useful than 64 bits
>eventually, perhaps.

1) As bill notes, computing history is filled with machines that have/had
word sizes that weren't powers of two. Some examples include:
36:	IBM 7090, DEC PDP-10, GE 635, Univac 1108
48:	Burroughs B5000
51:	Burroughs B6700
60:	CDC 6600
(and of course, the successors to these)
And of course, there were plenty of minis with 12-, 18-, or 24-bit words.
If you look around, you can probably find that somebody has built some
machine somewhere with almost any number of bits/word from 8 to 64, especially
given that tagged-architecture machines often have unusual sizes.

2) However, at this point, most people build general-purpose machines with
power-of-two wordsizes, and it seems likely that this will continue,
with the possible exception that tagged-architecture machines might
have power-of-two space for data, plus bits for tags.
Why?
	These days, you would have to think long and hard before creating
	a new general-purpose architecture to which it is difficult to:
	port C
	port UNIX
	port FORTRAN, COBOL, PL/1, PASCAL, etc, etc.
and
	you would want to think real hard before introducing architectures
whose character-size is inconvenient and poorly matched with existing
peripherals, support chips, etc.

Note: I did not say you'd never do this, I just said you'd better have
pretty good reasons for it.

Also note that with 8-bit chars, 16-bit shorts, and words of 32 or 64 bits,
addressing is simple (low order bits select sub-unit within the word),
everything is packed 100% full, and there are no weirdly-special pointers
to different kinds of objects.
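
To make the "low order bits select sub-unit" point concrete, here is a minimal
C sketch, assuming a byte-addressed machine with 32-bit words (4 bytes per
word). The function names are made up for illustration; they are not from
any real machine or library.

```c
#include <stdint.h>

/* Assumed layout: byte-addressed memory, 32-bit words, 4 bytes per word. */
enum { BYTES_PER_WORD32 = 4 };  /* a power of two, so masks and shifts work */

/* Index of the word containing a byte address: drop the low 2 bits. */
static uint32_t word_index32(uint32_t byte_addr)
{
    return byte_addr >> 2;
}

/* Position of the byte within its word: keep only the low 2 bits. */
static uint32_t byte_in_word32(uint32_t byte_addr)
{
    return byte_addr & (BYTES_PER_WORD32 - 1);
}

/* Position of an aligned 16-bit short within its word: bit 1 of the address. */
static uint32_t short_in_word32(uint32_t byte_addr)
{
    return (byte_addr >> 1) & 1;
}
```

For byte address 13, this gives word 3, byte 1 within that word. No divide
is needed anywhere, and the same plain integer serves as a pointer to a char,
a short, or a word.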

3) Unfortunately, 48 doesn't work very well under these circumstances:
	Assume char = 8 bits, short = 16, and int = 48.
	6 chars/word, 3 shorts.  Ugh.
Now, you get several very unpleasant choices:
	a) The machine is byte-addressed, and to obtain the address of
	the word containing a byte, you get to divide by 6, something
	hardware designers show scant enthusiasm for. :-)
	b) The machine is word addressed, with some kind of special byte
	pointer (the solution adopted by most of the non-power-of-two
	machines).  A typical mechanism would use the low-order
	3 bits to select the byte within the word, with special string
	instructions that increment the word address, and reset the byte
	number to 0, whenever the byte count exceeds the number of bytes
	per word.
	Likewise, you will probably do something for shorts.
	In this case, the hardware folks may be happier, but the compiler
	people are not.  C has certainly been ported to such machines,
	and so have many UNIX commands, so it is possible. But it is
	not fun, and even worse, if you'd like to get lots of third-party
	software, things will not be so easy. (People may recall that
	the Stanford MIPS used word-addressing with byte pointers, whereas
	none of the MIPS Computer Systems chips do so....there's a reason :-)
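
The two choices above can be sketched in C as follows. This is only an
illustration under stated assumptions: 48-bit words holding six 8-bit chars,
and a byte pointer encoded as (word address << 3) | byte number. The type
and function names are hypothetical and do not describe any particular
machine.

```c
#include <stdint.h>

enum { CHARS_PER_WORD48 = 6 };  /* 48-bit word / 8-bit char */

/* Choice (a): byte-addressed. Locating the word takes a true divide by 6,
   and the offset a remainder; neither reduces to a shift or a mask. */
static uint32_t word_of_byte48(uint32_t byte_addr)
{
    return byte_addr / CHARS_PER_WORD48;
}

static uint32_t offset_in_word48(uint32_t byte_addr)
{
    return byte_addr % CHARS_PER_WORD48;
}

/* Choice (b): word-addressed with a special byte pointer.
   Assumed encoding: low-order 3 bits hold the byte number (0..5),
   the remaining bits hold the word address. */
typedef uint32_t byte_ptr48;

static byte_ptr48 byte_ptr48_make(uint32_t word, uint32_t byte)
{
    return (word << 3) | byte;
}

static uint32_t byte_ptr48_word(byte_ptr48 p) { return p >> 3; }
static uint32_t byte_ptr48_byte(byte_ptr48 p) { return p & 7; }

/* Advance to the next char: bump the byte number, and when it runs past
   the last byte of the word, reset it to 0 and step the word address. */
static byte_ptr48 byte_ptr48_next(byte_ptr48 p)
{
    uint32_t word = byte_ptr48_word(p);
    uint32_t byte = byte_ptr48_byte(p) + 1;

    if (byte >= CHARS_PER_WORD48) {
        byte = 0;
        word++;
    }
    return byte_ptr48_make(word, byte);
}
```

Note that with this encoding a plain integer increment does not work: adding 1
to the pointer for (word 7, byte 5) yields byte number 6, which does not
exist. That is exactly the kind of thing that makes char pointers "special"
and keeps compiler writers up at night.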

4) Well, maybe 8-bit bytes are bad, and a 48-bit machine should have
12-bit bytes, 24-bit shorts.  This is probably easier for porting software,
but there will still be problems. It will be easier to make the code on
a single machine consistent, but it will be worse talking to the outside
world.  Networking code will be exciting, and you're on your own when it
comes to busses, peripheral chips, SIMMs, etc.  Finally, 12-bit bytes have the
awkwardness of using 50% more space than 8-bit ones, without even having
the advantage of improving language coverage much (e.g., for some
Asian languages that really need about 16 bits/character, or more).
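
A rough C sketch of both complaints, assuming one character stored per byte
and an outside world that speaks 8-bit octets. The function names and the
big-endian bit order are my assumptions for illustration, not anything a
real 12-bit-byte machine is known to do.

```c
#include <stdint.h>

/* Bits of storage for n characters stored one per byte. */
static unsigned long storage_bits(unsigned long n_chars, unsigned bits_per_byte)
{
    return n_chars * bits_per_byte;
}

/* Pack three incoming 8-bit octets into two 12-bit bytes.
   Each 12-bit byte is held in the low 12 bits of a uint16_t.
   Assumed bit order: first octet ends up in the most significant bits. */
static void pack_octets_12(const uint8_t in[3], uint16_t out[2])
{
    uint32_t bits = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];

    out[0] = (uint16_t)((bits >> 12) & 0xFFF);
    out[1] = (uint16_t)(bits & 0xFFF);
}
```

A thousand characters take 12000 bits instead of 8000, and every octet
stream from a network or a peripheral has to be repacked (or stored one
octet per 12-bit byte, wasting a third of each byte) before the rest of the
software can treat it as ordinary chars.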

SUMMARY:
1) Software inertia strongly impels people to build machines whose
words contain 2**n bytes, for C especially, but also for other languages.
2) (Some) software inertia and (much) hardware inertia impel people
to use 8-bit characters.
3) So, I'd be amazed if a new general-purpose architecture turned out
to be viable at 48 bits.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086