Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!hplabs!hpda!hpisoa2!hpitg!peora!jer@peora
From: jer%peora@peora.UUCP
Newsgroups: net.arch
Subject: Re: Re: How Many Virtual Spaces (minor
Message-ID: <2120@peora>
Date: Mon, 28-Apr-86 18:03:00 EDT
Article-I.D.: peora.2120
Posted: Mon Apr 28 18:03:00 1986
Date-Received: Sun, 11-May-86 15:46:40 EDT
References: <5100056@ccvaxa>
Lines: 49

> Seems to me that you've worked around to one single virtual address space.
> You've got a fixed number of address bits for your address within a process,
> plus a fixed number of address bits for the process number => one bigger
> virtual address with a fixed number of bits.
>
> What you've proposed is segments, with a one-to-one correspondence between
> segments and processes.

It was my intent to work around to a single virtual address space, because
that was what you were asking about.

However, note that this address space has one unusual property, viz., that
if the "process number" bits are all zero (or some other unique value,
but let's say zero WLOG) then you get your "own" process's address space,
just as if you'd filled in your process number there.
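
To make the zero-alias property concrete, here is a minimal sketch in
Python. The field widths (8 process bits, 24 offset bits) and the name
resolve_process are made up for illustration; nothing above fixes them.

```python
# Hypothetical field widths; the scheme described above does not specify any.
PROC_BITS = 8
OFFSET_BITS = 24
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def resolve_process(addr, current_process):
    """Split an address into (process number, offset within that process).

    A process field of zero is treated as an alias for the current process.
    """
    proc = addr >> OFFSET_BITS
    offset = addr & OFFSET_MASK
    if proc == 0:
        proc = current_process
    return proc, offset
```

With process 5 running, address 0x00001234 and address (5 << 24) | 0x1234
both resolve to (5, 0x1234). One consequence of the sketch as written is
that no real process can be given the number zero, since that value is
reserved for the alias.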

I *think* this provides what you concluded you require, viz., a way for
the code to tell "what processor it's running on" (by which I take you to
mean "what process it's running on behalf of" -- clearly in a symmetrical
multiprocessor the code doesn't have to know what processor it's running
on at all, except for the small amount of code that actually schedules
processes onto processors).

The issue of "where to put the caches" also applies to "where to put the
address translation hardware".  Since such hardware tends to be slow, you
get better performance if you put it with the processors, so that you have
a number of them working in parallel.  On the other hand, you then have
your basic consistency problem (a favorite topic of mine since my research
back in graduate school involved a model of memory where data objects,
rather than memory locations, had names, in order to avoid this problem),
i.e., keeping multiple copies of what are really the same object consistent.

Alternatively, you can eliminate this problem by putting the translation
hardware out at the memory (which I believe is what was done by
Gottlieb et al. in their supercomputer project, along with also putting
some adders and so on out there), but then you only have one of them,
which means it has to be very fast to avoid a bottleneck.  I recall reading
a comment by Gottlieb about that fairly recently, where he was saying he
wished he'd put his memory management at the processors instead.

Regarding the "segments", I have some difficulty with that term because it
seems to constrain a lot of thinking.  What I am actually proposing is that
you have multiple translation tables for the lower-order bits of your
address, one such table per process, and you select the table number based
on the high-order bits of the address.  The processor would also have a
register identifying the current process number, which it would use to fill
in these high-order bits whenever they would otherwise be zero.
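
A rough sketch of that arrangement, in Python, might look like the
following. The page-granular tables, the field widths, and the names
Translator, map_page and translate are all assumptions made for the sake
of the example, not a description of any real machine.

```python
# Hypothetical layout: [ process number | virtual page | byte offset ]
PROC_BITS, PAGE_BITS, OFFSET_BITS = 8, 12, 12


class Translator:
    def __init__(self, current_process):
        # Models the register holding the current process number.
        self.current_process = current_process
        # One translation table per process: {virtual page: physical frame}.
        self.tables = {}

    def map_page(self, process, vpage, frame):
        self.tables.setdefault(process, {})[vpage] = frame

    def translate(self, vaddr):
        proc = vaddr >> (PAGE_BITS + OFFSET_BITS)
        if proc == 0:
            # High-order bits are zero: fill them in from the register.
            proc = self.current_process
        vpage = (vaddr >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        # A missing entry raises KeyError, standing in for a fault.
        frame = self.tables[proc][vpage]
        return (frame << OFFSET_BITS) | offset


t = Translator(current_process=3)
t.map_page(3, 1, 0x40)   # process 3, virtual page 1 -> frame 0x40
t.map_page(7, 1, 0x99)   # process 7, virtual page 1 -> frame 0x99
own = t.translate(0x1ABC)                  # high bits zero: own table
explicit = t.translate((3 << 24) | 0x1ABC) # same table, named explicitly
other = t.translate((7 << 24) | 0x1ABC)    # process 7's table
```

Here own and explicit come out equal, while other lands in a different
frame, because the high-order bits pick a different table. The sketch
does no protection checking, which a real design would presumably need
before letting one process name another's table.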
-- 
E. Roskos