Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site calgary.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!ihnp4!alberta!calgary!radford
From: radford@calgary.UUCP (Radford Neal)
Newsgroups: net.arch
Subject: Strange architecture proposal
Message-ID: <968@calgary.UUCP>
Date: Mon, 11-Feb-85 23:06:21 EST
Article-I.D.: calgary.968
Posted: Mon Feb 11 23:06:21 1985
Date-Received: Wed, 13-Feb-85 03:00:07 EST
Distribution: net
Organization: University of Calgary, Calgary, Alberta
Lines: 94


Any comments on the following idea?


Take a machine with bytes, words, longs, etc. - all of sizes which are
a power-of-two multiple of some basic unit - and which requires such
data to be aligned, with the address of a unit of size 2**N being
required to have N low-order zero bits.

Such a machine presumably has instructions for handling the different
data units - e.g. MOVEB, MOVEW, MOVEL, etc. Let's say there are no
registers, so a MOVE instruction always requires two memory addresses
to be specified in some fashion.


Here's the idea: Get rid of the MOVEB, MOVEW, MOVEL, etc. instructions
and replace them with a single MOVE instruction, with the type of 
unit to be moved determined by the memory address given. Addresses
have a new format, as for instance:

      0 0 1 A8 A7 A6 A5 A4 A3 A2

This example specifies a long operand (four bytes) with address

      A8 A7 A6 A5 A4 A3 A2 0 0

In general, the size of an operand is 2**N basic units, where N is the
number of leading zeros in its address. The "actual" address is the bits
after the highest-order one bit, padded with N low-order zeros to make
it come out the right size. In this scheme, the address

      0 0 0 0 0 0 0 0 0 1

addresses all of memory as a single block (assuming the scheme has been
carried to this extreme).
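
To make the decoding rule concrete, here is a rough sketch in C of what
the address unit would have to do. The 10-bit width matches the example
above; the names ADDR_BITS, operand and decode are made up for this
illustration and are not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 10            /* assumed width of a tagged address */

typedef struct {
    unsigned log2_size;         /* N: the operand is 2**N bytes long */
    uint32_t byte_addr;         /* the "actual", aligned byte address */
} operand;

/* Hypothetical decoder: split a tagged address into size and address. */
static operand decode(uint32_t tagged)
{
    operand op;
    unsigned n = 0;
    uint32_t marker = 1u << (ADDR_BITS - 1);

    /* An all-zero address has no marker bit and names nothing. */
    assert(tagged != 0 && tagged < (1u << ADDR_BITS));

    /* Count the leading zeros within the ADDR_BITS-wide field. */
    while ((tagged & marker) == 0) {
        marker >>= 1;
        n++;
    }

    op.log2_size = n;
    /* Drop the marker bit, then pad with N low-order zeros. */
    op.byte_addr = (tagged & (marker - 1)) << n;
    return op;
}
```

With these assumptions, decode(0x0AB) (binary 0 0 1 0 1 0 1 0 1 1) gives
a four-byte operand at byte address 172, and decode(1) gives a 512-byte
operand at address 0, i.e. all of a 9-bit memory.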


Notice that this format requires only a single extra bit in addresses
to encode the size of the object pointed to. This is possible because
traditional addresses on aligned machines waste information space, as
evidenced by the existence of illegal addresses (e.g. odd word addresses).

Notice also that this scheme is different from a "tagged" architecture
in that the size is a property of the POINTER, not of the data. If you
want to start looking at your integer as a series of bytes, you need only
shift your pointer left (by two bits for a four-byte long); you don't
have to change what's stored in memory.
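
As a sketch of that pointer conversion, assuming the same 10-bit tagged
format as the example above: shifting left by k bits yields a pointer to
the first sub-unit 2**k times smaller, and shifting right by k bits
yields the enclosing unit 2**k times larger. The names narrow and widen
are invented here for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 10            /* assumed width of a tagged address */

/* Pointer to the first sub-unit that is 2**k times smaller. */
static uint32_t narrow(uint32_t tagged, unsigned k)
{
    assert(k < ADDR_BITS);
    /* The top k bits must be zero: the operand holds 2**k sub-units. */
    assert(tagged != 0 && (tagged >> (ADDR_BITS - k)) == 0);
    return tagged << k;
}

/* Pointer to the enclosing unit that is 2**k times larger. */
static uint32_t widen(uint32_t tagged, unsigned k)
{
    assert(k < ADDR_BITS);
    assert((tagged >> k) != 0);     /* the marker bit must survive */
    return tagged >> k;
}
```

For instance, the long pointer 0x0AB (byte address 172) narrows by two
bits to 0x2AC, the byte pointer to address 172, and by one bit to 0x156,
the word pointer to the same address.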

In addition to getting rid of all the various forms of MOVE instruction,
this scheme also gets rid of all the conversion instructions - moving
from a byte address to a long address automatically zero-extends. At most
you need two instructions, MOVEU and MOVES, to allow you a choice of
zero-extend or sign-extend.
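
What the two instructions would do to a value on its way to a larger
destination might look like the following in C. This is only a sketch:
the function name extend is invented, the destination is assumed to be a
32-bit long, and on the proposed machine the source size would come from
the leading-zero count of the source address rather than from an
explicit argument.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 2**src_log2-byte value to 32 bits.
   sign == 0 models MOVEU (zero-extend), sign != 0 models MOVES. */
static uint32_t extend(uint32_t raw, unsigned src_log2, int sign)
{
    unsigned bits;
    uint32_t mask;

    assert(src_log2 <= 2);          /* byte, word or long source */
    bits = 8u << src_log2;
    if (bits == 32)
        return raw;                 /* already full width */

    mask = (1u << bits) - 1;
    raw &= mask;                    /* zero-extension */
    if (sign && (raw >> (bits - 1)) != 0)
        raw |= ~mask;               /* copy the sign bit upward */
    return raw;
}
```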

Also, there is no need for "stride adjustment" with statements such as

    int *p;
    p += 1;

Adding one to an address always gets you the next element, regardless
of the size of the data type pointed to.
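
A small C sketch can check this, again assuming the 10-bit tagged format
of the earlier example. The helpers real_addr and stride are
hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 10            /* assumed width of a tagged address */

/* Real byte address named by a tagged pointer. */
static uint32_t real_addr(uint32_t tagged)
{
    unsigned n = 0;

    assert(tagged != 0 && tagged < (1u << ADDR_BITS));
    /* n becomes the number of leading zeros, i.e. log2 of the size. */
    while ((tagged >> (ADDR_BITS - 1 - n)) == 0)
        n++;
    /* Clear the marker bit, then pad with n low-order zeros. */
    return (tagged ^ (1u << (ADDR_BITS - 1 - n))) << n;
}

/* Number of bytes stepped over by "p += 1" on a tagged pointer. */
static uint32_t stride(uint32_t tagged)
{
    return real_addr(tagged + 1) - real_addr(tagged);
}
```

Under these assumptions stride gives 4 for the long pointer 0x0AB, 2 for
the word pointer 0x156 and 1 for the byte pointer 0x2AC. One thing the
sketch makes visible: incrementing a pointer to the last element of a
given size carries into the marker bit, so 0x0FF + 1 = 0x100 is no
longer a long pointer but a word pointer to address 0.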


Finally, this scheme allows one to write more general subroutines than
one can without it, e.g. something like:

    long max(a,n)
      anyinttype a[];
      int n;
    { long r; int i;
      r = a[0];
      for (i = 1; i<n; i++) if (a[i]>r) r = a[i];
      return r;
    }

should compile to efficient code which will find the maximum of an array
of bytes, words, or longs.
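
To see how such a routine could behave, here is a simulation in ordinary
C, with the tagged pointer passed as an integer. Everything about it is
an assumption made for illustration: the 10-bit address format from the
earlier example, a 512-byte simulated memory, little-endian byte order,
signed elements, and the names mem, load and tagged_max.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 10                        /* assumed address width */
#define MEM_BYTES (1u << (ADDR_BITS - 1))   /* 512 bytes of memory */

static uint8_t mem[MEM_BYTES];              /* simulated memory */

/* Load the signed element (1, 2 or 4 bytes) a tagged pointer names. */
static long load(uint32_t tagged)
{
    unsigned n = 0, i;
    uint32_t addr, raw = 0;

    assert(tagged != 0 && tagged < (1u << ADDR_BITS));
    while ((tagged >> (ADDR_BITS - 1 - n)) == 0)
        n++;                                /* n = leading zeros */
    assert(n <= 2);                         /* bytes, words, longs only */
    addr = (tagged ^ (1u << (ADDR_BITS - 1 - n))) << n;

    /* Assemble the bytes, little-endian (an assumption). */
    for (i = 0; i < (1u << n); i++)
        raw |= (uint32_t)mem[addr + i] << (8 * i);

    /* Sign-extend; the casts rely on two's-complement conversion. */
    switch (n) {
    case 0:  return (int8_t)raw;
    case 1:  return (int16_t)raw;
    default: return (int32_t)raw;
    }
}

/* The same loop as max() above, written against tagged pointers. */
static long tagged_max(uint32_t a, int n)
{
    long r = load(a);
    int i;

    for (i = 1; i < n; i++) {
        long v = load(a + (uint32_t)i);     /* a[i]: just add i */
        if (v > r)
            r = v;
    }
    return r;
}
```

The loop body is identical for all three element sizes; only the pointer
handed in differs (for example 0x208 for bytes starting at address 8, or
0x108 for words starting at address 16).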


I can see a few problems here myself. A minor one is figuring out what
to do with registers - a simple register number won't contain the 
size information. The register numbers could be changed in the same 
fashion as the memory addresses to encode the size, or the registers could
be treated specially, always using all their bits for instance.

Another problem is that the shifting required to get the "real" address
might slow down the processor too much. Any ideas on how much?

Finally, any programs which actually used the flexibility illustrated 
by the "max" function above would become hopelessly un-portable to a
standard architecture.

    Radford Neal
    The University of Calgary