Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!gatech!udel!rochester!pt.cs.cmu.edu!MATHOM.GANDALF.CS.CMU.EDU!lindsay
From: lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay)
Newsgroups: comp.arch
Subject: Re: looking for >32-bit address space
Message-ID: <4651@pt.cs.cmu.edu>
Date: 5 Apr 89 18:18:31 GMT
References: <1032@myrias.UUCP> <650010@hpclscu.HP.COM>
Organization: Carnegie-Mellon University, CS/RI
Lines: 55

There are two important issues.
Issue number one is the short-pointer problem.
Issue number two is the mapped-file problem.

Short pointers are simply an economy measure.  Reducing the number of
bits used may make programs smaller, reduce processor cycles, compress
data structures, and so on.  In the case of 32-bit machines with >32-bit
virtual addresses, it's "obvious" that programs will deal in 32-bit
short addresses.

The problem is that there are now two "pointer" data types, not one,
and the programmer has to know that
	<small> := <big>
doesn't work.  Or, alternatively, the compiler can have a "large
model/small model" switch.  The programmer is off the hook, but now you
need two copies of every library .. and every application .. and the
"big" versions may be unnecessarily slow.
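To make the <small> := <big> hazard concrete, here is a minimal sketch in
C. The type names long_addr and short_addr and the helpers narrow() and
widen() are made up for illustration; they assume a 64-bit "big" address
and a 32-bit "small" one, and are not taken from any real compiler or
memory model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t long_addr;   /* hypothetical >32-bit virtual address */
typedef uint32_t short_addr;  /* hypothetical 32-bit short pointer    */

/* Checked <small> := <big>.  Returns 1 and stores the short form only
   when the long address fits in 32 bits; returns 0 otherwise. */
int narrow(long_addr big, short_addr *small)
{
    assert(small != NULL);
    if (big > UINT32_MAX)
        return 0;             /* a plain cast would silently drop bits */
    *small = (short_addr)big;
    return 1;
}

/* <big> := <small> is always safe: zero-extend. */
long_addr widen(short_addr small)
{
    return (long_addr)small;
}
```

An unchecked cast such as (short_addr)0x100000010 yields 0x10, which is
a valid-looking address for the wrong object.  That silent aliasing is
what the programmer (or the compiler's model switch) has to guard against.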

"Segment register" schemes don't really help, because now pointers are
only meaningful when the context is as it was when they were created.
A pointer is no longer standalone, and you can't have "magic cookies"
("opaque types").  So, any generalized routine that is passed a
pointer, must also be passed segment info, and we're back around to
short/long. Also, it may be impossible to build big objects out of
successive segments, in such a way that the "next address" operation is
just "increment". (I suppose a trap is preferable to inline overflow
checking.)
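The "next address" point can be sketched in C as follows.  This assumes
a made-up layout of successive, non-overlapping 64 KB segments with a
16-bit segment number and a 16-bit offset; seg_addr, flatten(), next()
and naive_next() are illustrative names, not any particular machine's
scheme.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_SIZE 0x10000u     /* hypothetical 64 KB segments */

typedef struct {
    uint16_t seg;             /* segment number: the "context"      */
    uint16_t off;             /* offset in segment: the short part  */
} seg_addr;

/* Position in the flat space the segments are laid out in. */
uint32_t flatten(seg_addr a)
{
    return (uint32_t)a.seg * SEG_SIZE + a.off;
}

/* Plain "increment" of the short part: wraps inside the same segment. */
seg_addr naive_next(seg_addr a)
{
    a.off++;
    return a;
}

/* Correct "next address": inline overflow check on every step. */
seg_addr next(seg_addr a)
{
    a.off++;
    if (a.off == 0)           /* offset wrapped, so carry into segment */
        a.seg++;
    return a;
}
```

Starting from segment 1, offset 0xFFFF, next() lands on flat address
0x20000, while naive_next() wraps back to 0x10000.  The offset alone
says nothing about which of the two was meant, which is the sense in
which the pointer is no longer standalone.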


Issue number two, mapped files, is simply that there may be a strong
advantage to having Really Big Objects.  Database people would like to
map an entire external device into the address space.  Since 5 GB disk
drives already exist, and 32 bits can address only 4 GB, 32-bit
addresses already don't cut it.
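The arithmetic is easy to check.  A small C sketch, with address_bits()
as an illustrative helper name and byte addressing assumed:

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of address bits b such that 2^b >= nbytes, i.e.
   enough to give every byte of the object its own address. */
unsigned address_bits(uint64_t nbytes)
{
    unsigned bits = 0;
    while (bits < 64 && (UINT64_C(1) << bits) < nbytes)
        bits++;
    return bits;
}
```

A 4,294,967,296-byte object needs exactly 32 bits; a 5,000,000,000-byte
drive needs 33.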

Stonebraker argued that database systems would like a flat view of a
disk, because otherwise they are imposing structure on top of the OS's
structures, and both layers do work (e.g. indexing) that would be more
efficient if done by only one layer.  The Mach operating system has a
nice feature for this.  Mach allows an application developer to write
his own page fault handler, which gets executed in the same space as
his application.  (In theory, the other threads of the application
don't need to block unless they also need the same page.)
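The shape of the idea can be shown with a toy simulation in C.  This is
NOT the Mach interface: toy_space, toy_init(), toy_read() and db_pager()
are invented for illustration, and the "fault" is just a software check
on a residency flag rather than a real trap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NPAGES    8

/* Application-supplied handler: fill `frame` with the data for `page`. */
typedef void (*fault_handler)(size_t page, unsigned char *frame);

typedef struct {
    unsigned char frames[NPAGES][PAGE_SIZE];
    int           resident[NPAGES];  /* 1 once the page is filled */
    fault_handler handler;
    unsigned      faults;            /* how often the handler ran */
} toy_space;

void toy_init(toy_space *s, fault_handler h)
{
    memset(s, 0, sizeof *s);
    s->handler = h;
}

/* Read one byte; on a non-resident page, call the application's
   handler first, as the kernel would on a real page fault. */
unsigned char toy_read(toy_space *s, size_t addr)
{
    size_t page = addr / PAGE_SIZE;
    assert(page < NPAGES);
    if (!s->resident[page]) {
        s->handler(page, s->frames[page]);
        s->resident[page] = 1;
        s->faults++;
    }
    return s->frames[page][addr % PAGE_SIZE];
}

/* Stand-in for a database's pager: a real one would fetch the page
   from its own on-disk layout; this one stamps the page number. */
void db_pager(size_t page, unsigned char *frame)
{
    memset(frame, (int)page, PAGE_SIZE);
}
```

After toy_init(&s, db_pager), two reads within page 3 cost one handler
call, not two.  The point is that the database decides what a page
contains, so there need not be a second layer of indexing underneath it.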

This database argument seems to me like a strong argument for
supporting big addresses. I don't see any really acceptable way to
compress big addresses, except perhaps in application-dependent ways.
For example, the DB's fault handler could use N high-order address bits
in some "take a ticket"/"tag bit" style. This only really extends the
address space if the application knows how to do things like ticket
reclamation.  (These things are difficult in general, but easy if the
application can be specifically designed for it.)
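One way such a ticket scheme might look, as a sketch in C.  Everything
here is an assumption for illustration: 4 ticket bits at the top of a
32-bit short address, a 64-bit base per ticket, and the names
ticket_table, take_ticket(), release_ticket(), make_short() and
expand().

```c
#include <assert.h>
#include <stdint.h>

#define TICKET_BITS 4                         /* the "N" high-order bits */
#define NTICKETS    (1u << TICKET_BITS)
#define OFFSET_BITS (32 - TICKET_BITS)
#define OFFSET_MASK ((UINT32_C(1) << OFFSET_BITS) - 1)

typedef struct {
    uint64_t base[NTICKETS];    /* big address each ticket stands for */
    int      in_use[NTICKETS];
} ticket_table;

/* Take a ticket for the window starting at `base`; -1 if none free. */
int take_ticket(ticket_table *t, uint64_t base)
{
    for (unsigned i = 0; i < NTICKETS; i++) {
        if (!t->in_use[i]) {
            t->in_use[i] = 1;
            t->base[i] = base;
            return (int)i;
        }
    }
    return -1;
}

/* Reclamation: the application must know that no short address
   carrying this ticket is still live. */
void release_ticket(ticket_table *t, int ticket)
{
    assert(ticket >= 0 && (unsigned)ticket < NTICKETS);
    t->in_use[ticket] = 0;
}

/* Pack ticket and offset into a 32-bit short address. */
uint32_t make_short(int ticket, uint32_t offset)
{
    assert(ticket >= 0 && (unsigned)ticket < NTICKETS);
    assert(offset <= OFFSET_MASK);
    return ((uint32_t)ticket << OFFSET_BITS) | offset;
}

/* What the fault handler would do: recover the big address. */
uint64_t expand(const ticket_table *t, uint32_t short_addr)
{
    uint32_t ticket = short_addr >> OFFSET_BITS;
    assert(t->in_use[ticket]);
    return t->base[ticket] + (short_addr & OFFSET_MASK);
}
```

With a ticket taken for base 0x140000000 (5 GB into the device), the
short address make_short(k, 0x1234) expands to 0x140001234.  The table
has only 16 slots, so the space is extended only for as long as the
application can release tickets it is finished with.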


-- 
Don		D.C.Lindsay 	Carnegie Mellon School of Computer Science
--