Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!rpi!batcomputer!cornell!huff
From: huff@svax.cs.cornell.edu (Richard Huff)
Newsgroups: comp.arch
Subject: RE: Inverted Page Tables
Message-ID: <37877@cornell.UUCP>
Date: 28 Feb 90 04:56:12 GMT
Sender: nobody@cornell.UUCP
Reply-To: huff@cs.cornell.edu (Richard Huff)
Distribution: comp
Organization: Cornell Univ. CS Dept, Ithaca NY
Lines: 38

Someone suggested that Mach handles large sparse address spaces
efficiently.  But Mach's architecture-independent pmap module simply
implements an inverted page table in software, although its IPTs are
kept on a per-process basis rather than as a single IPT for the entire
machine.  So I still see IPTs as the only way to manage huge virtual
address spaces.
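
To make the IPT idea concrete, here is a minimal sketch of a hashed
inverted page table.  It is an illustration only, not Mach's code: the
class and method names, the use of an address-space id (asid), and the
chained-hash layout are all assumptions of mine.  The point is that the
table has one entry per physical frame, so its size tracks physical
memory rather than the size of the virtual address space.

```python
# Hypothetical sketch of a hashed inverted page table (IPT).
# None of these names come from Mach or any real kernel.

class InvertedPageTable:
    def __init__(self, num_frames):
        # One entry per physical frame: (asid, vpn), or None if the frame is free.
        self.frames = [None] * num_frames
        # Hash anchor table: bucket -> first frame in that bucket's chain (-1 = empty).
        self.anchor = [-1] * num_frames
        # Per-frame link to the next frame in the same hash chain (-1 = end).
        self.next = [-1] * num_frames

    def _bucket(self, asid, vpn):
        # Assumption: any reasonable hash of (asid, vpn) will do for a sketch.
        return hash((asid, vpn)) % len(self.anchor)

    def map_page(self, asid, vpn, frame):
        """Record that virtual page (asid, vpn) now lives in physical frame `frame`."""
        if self.frames[frame] is not None:
            raise ValueError("frame already in use")
        self.frames[frame] = (asid, vpn)
        b = self._bucket(asid, vpn)
        # Push the frame onto the front of its bucket's chain.
        self.next[frame] = self.anchor[b]
        self.anchor[b] = frame

    def lookup(self, asid, vpn):
        """Return the frame holding (asid, vpn), or None (a page fault)."""
        f = self.anchor[self._bucket(asid, vpn)]
        while f != -1:
            if self.frames[f] == (asid, vpn):
                return f
            f = self.next[f]
        return None
```

With 8 frames, mapping virtual page 1 << 50 costs exactly as much as
mapping page 0x10; nothing in the structure depends on how sparse or
how large the virtual address space is.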

Now here's an interesting research problem:  What is the most efficient
way to manage large virtual address spaces on NUMA (non-uniform memory
access) shared-memory MIMD machines?  Can we extend the IPT approach in
a clean way?  I'd like to handle a TLB fault on a page that is
physically present in local memory without referencing non-local
memory.  OK, I could do that with a local IPT approach.  But what if
the page is physically present on another processor's node?  How can I
determine which node to talk to, WITHOUT utilizing a local data
structure that grows with the number of nodes in the system?  Remember,
I want a SCALABLE solution.  And we presumably don't want a virtual
address to always be associated with a fixed node, so the processor
number can't be encoded in the address itself.  Besides, I might want
to let the OS move pages around between nodes for maximum locality.  In
this case, we might view local memory as simply a page cache for the
larger global address space.

Will TLB faults be rare enough for me to simply maintain a master page
directory for the entire machine, distributed across all of the nodes,
that only gets used when the local IPT comes up empty?  Should we use
multiple distributed IPTs, say, one per "cluster" of nodes?
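
One way to picture the distributed master directory is sketched below.
This is my own illustration of the idea, not a description of any
existing machine: every name is hypothetical, and hashing the virtual
page number to pick a "home" node is just one assumed way of finding
the right directory shard without a per-node lookup table.  The home
node only stores who currently owns the page, so the page itself is
free to migrate.

```python
# Hypothetical sketch: local IPT first, then a directory shard on a hashed home node.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.local_ipt = {}        # vpn -> local frame, for pages resident on this node
        self.directory_shard = {}  # vpn -> owner node id, only for pages homed here

def home_node(vpn, num_nodes):
    # A pure function of the address: no local table that grows with node count.
    # Assumption: a simple multiplicative hash is good enough for illustration.
    return ((vpn * 2654435761) & 0xFFFFFFFF) % num_nodes

def place_page(nodes, vpn, owner, frame):
    """Make `vpn` resident in `frame` on node `owner` and register it at its home."""
    nodes[owner].local_ipt[vpn] = frame
    nodes[home_node(vpn, len(nodes))].directory_shard[vpn] = owner

def translate(nodes, requester, vpn):
    """Return (owner node, frame) for `vpn`, or None if it is resident nowhere."""
    me = nodes[requester]
    if vpn in me.local_ipt:
        # Local hit: no non-local memory is touched.
        return requester, me.local_ipt[vpn]
    # Local miss: ask the home node's shard who owns the page.
    owner = nodes[home_node(vpn, len(nodes))].directory_shard.get(vpn)
    if owner is None:
        return None
    return owner, nodes[owner].local_ipt[vpn]

def migrate_page(nodes, vpn, new_owner, new_frame):
    """Move a resident page to another node; only the home shard entry changes."""
    home = nodes[home_node(vpn, len(nodes))]
    old_owner = home.directory_shard[vpn]
    del nodes[old_owner].local_ipt[vpn]
    nodes[new_owner].local_ipt[vpn] = new_frame
    home.directory_shard[vpn] = new_owner
```

In this sketch each node holds its own IPT plus a shard covering
roughly 1/N of the resident pages, so per-node state does not grow
with the number of nodes.  The cost is that a local miss usually means
a remote reference to the home node and then another to the owner, so
whether this is acceptable still depends on how rare such faults are.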

What is currently being done, or being proposed, for such large NUMA
MIMD machines?  How do the Butterfly II, Ultracomputer, or RP3 do
virtual-to-physical address translation?  Do they employ a separate
virtual address space per process?  Is it 32 bits, or larger?

Is anyone out there considering building a NUMA MIMD shared-memory
machine with a single, machine-wide, 64-bit virtual address space?


Richard Huff
huff@svax.cs.cornell.edu