Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site LaBrea.ARPA
Path: utzoo!watmath!clyde!bonnie!akgua!whuxlm!harpo!decvax!decwrl!Glacier!LaBrea!mann
From: mann@LaBrea.ARPA
Newsgroups: net.arch
Subject: Re: I like segmented architectures
Message-ID: <121@LaBrea.ARPA>
Date: Fri, 14-Jun-85 19:44:59 EDT
Article-I.D.: LaBrea.121
Posted: Fri Jun 14 19:44:59 1985
Date-Received: Sat, 15-Jun-85 10:26:00 EDT
References: <276@spar.UUCP> <5653@utzoo.UUCP> <291@spar.UUCP> <1031@peora.UUCP> <118@LaBrea.ARPA> <1053@peora.UUCP>
Organization: Stanford University
Lines: 75

Wow!  I guess it's time for me to pay the price for flaming by trying to
back up my opinion.

First let me say I called the 68451 "utterly ridiculous garbage" in a moment
of weakness.  Normally I don't flame like that, but I've been reading
laser-lovers lately, and suddenly the urge overcame me.  Anyway, my real
opinion of the 68451 is that it's a lot better than no memory management unit
at all, but it has some serious problems.  Most were pointed out in previous
postings, by people who said in effect "well, of course it has the following
problems, but I don't understand why people hate it so much."  I'll
summarize the problems below.

The basic design of the 68451 forces the operating system to manage physical
(as well as virtual) memory in 2^k sized chunks (called segments), where k
is variable (but must be an integer!).  A 2^k sized chunk must start on an
address that is an exact multiple of 2^k.  You can't choose a uniform small
k (like 10) and simulate a page-oriented MMU because you rapidly run out of
segment descriptors (there are only 32).  The operating system code needed
to manage physical memory in this way is much more complex than what is
needed with a paging MMU.  It is also difficult to avoid wasting a lot of
physical memory in fragmentation and/or spending a lot of time copying data
from one part of physical memory to another when an address space grows.
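
To make the size and alignment rule concrete, here is a minimal C sketch.
The function names are made up for illustration and nothing here talks to
real hardware; the only figure taken from the text is the count of 32
descriptors.  It checks whether a (base, size) pair is a legal segment under
the rule described above, and computes how much memory could be mapped at
once if every segment were a uniform 2^k bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_DESCRIPTORS 32u   /* from the text: only 32 segment descriptors */

/* A segment is legal only if its size is a power of two and its base
 * address is an exact multiple of that size. */
static bool valid_segment(uint32_t base, uint32_t size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;
    return power_of_two && (base & (size - 1)) == 0;
}

/* Upper bound on bytes mapped at once when every segment is 2^k bytes
 * (hypothetical helper; assumes k is small enough not to overflow). */
static uint32_t max_mapped_uniform(unsigned k)
{
    return NUM_DESCRIPTORS << k;
}
```

With k = 10, max_mapped_uniform(10) comes to 32768 bytes, so a uniform 1K
"page" size would let the chip map only 32K at a time across all address
spaces.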

I'll try to make this clear with an example.  Let's say you are implementing
a Unix-style model of process address space, where there is an upper bound
that is allowed to grow dynamically.  If you try to minimize internal
fragmentation by allocating very little more actual memory than the process
is using at any given time, you use up a lot of segment descriptors since
you need one for each 1 bit in the binary representation of the memory size.
Then let's say you try to grow the space.  Each time you grow it, you have
the choice of doing it by tacking on a small chunk to the end (chewing up
another segment descriptor), or replacing one or more of the existing chunks
by larger ones.  Replacing some chunk(s) by a larger one requires copying
their data to a new place in physical memory, unless you had the good fortune
to find that the existing chunks were contiguous and started on the right
boundary in physical space, and the space following them was free.
One could put in some heuristics to make this condition more likely, but
these make the OS memory management code yet more complicated.
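
The descriptor arithmetic in this example can be sketched in a few lines of
C.  This is only an illustration: descriptors_needed and split_chunks are
hypothetical names, and sizes are assumed to be already rounded to whatever
the chip's minimum segment size is.  The first function counts the 1 bits in
the size; the second lists the power-of-two chunks, largest first, so that
each chunk lands on a virtual offset that is a multiple of its own size.

```c
#include <stdint.h>

/* One power-of-two segment is needed per 1 bit in the size. */
static unsigned descriptors_needed(uint32_t size)
{
    unsigned n = 0;
    while (size != 0) {
        size &= size - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Write the chunk sizes covering [0, size) to out, largest first.
 * Returns the number of chunks (at most 32). */
static unsigned split_chunks(uint32_t size, uint32_t out[32])
{
    unsigned n = 0;
    for (int bit = 31; bit >= 0; bit--) {
        uint32_t chunk = (uint32_t)1 << bit;
        if (size & chunk)
            out[n++] = chunk;
    }
    return n;
}
```

For instance, a 91K space splits into 64K + 16K + 8K + 2K + 1K, which is
five descriptors.  Growing it by 1K to 92K gives 64K + 16K + 8K + 4K: the
old 2K and 1K chunks have to be replaced by a single 4K chunk, which means
copying unless they happened to be contiguous, suitably aligned, and
followed by 1K of free physical memory.  The alternative is to keep them and
spend a sixth descriptor on the extra 1K.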

An advertised benefit of the 68451 is that it allows fast context switching
because you can keep the segments for multiple address spaces in the chip at
the same time.  Of course, with only 32 segment descriptors to go around and
each address space needing several, you run out rather quickly if you try to
keep them all in there.  So you need to swap them in and out, perhaps
keeping the N most recently used sets in the chip in hopes of minimizing the
amount of descriptor swapping.
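
A rough sketch of the "N most recently used sets" bookkeeping might look
like the following.  Everything here is an assumption made for illustration:
the names, the fixed figure of 5 descriptors per address space, and the
simple tick-based LRU.  A real kernel would also have to handle address
spaces that need different numbers of descriptors, and would actually load
the descriptors where the comment says so.

```c
#include <stdint.h>

#define NUM_DESCRIPTORS 32u
#define DESCS_PER_SPACE 5u    /* assumption: every space needs 5 descriptors */
#define RESIDENT_SETS   (NUM_DESCRIPTORS / DESCS_PER_SPACE)   /* 6 sets fit */

struct resident_set {
    int      space_id;    /* -1 means the slot is empty */
    uint32_t last_used;   /* tick of the most recent switch to this space */
};

struct mmu_cache {
    struct resident_set slot[RESIDENT_SETS];
    uint32_t tick;
};

static void cache_init(struct mmu_cache *c)
{
    for (unsigned i = 0; i < RESIDENT_SETS; i++) {
        c->slot[i].space_id = -1;
        c->slot[i].last_used = 0;
    }
    c->tick = 0;
}

/* Returns 1 if the space's descriptors were already resident, or 0 if the
 * least recently used set had to be replaced by this one. */
static int switch_to(struct mmu_cache *c, int space_id)
{
    unsigned victim = 0;

    c->tick++;
    for (unsigned i = 0; i < RESIDENT_SETS; i++) {
        if (c->slot[i].space_id == space_id) {
            c->slot[i].last_used = c->tick;
            return 1;
        }
        if (c->slot[i].last_used < c->slot[victim].last_used)
            victim = i;
    }
    /* Miss: this is where the victim's descriptors would be overwritten
     * with the new space's descriptors. */
    c->slot[victim].space_id = space_id;
    c->slot[victim].last_used = c->tick;
    return 0;
}
```

Under these assumptions only six address spaces stay resident, so a seventh
active process already forces descriptor reloads on some context switches.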

Protection bits are also on a per-segment basis, so if you have different
areas in your process that need different protections, you chew up more
segment descriptors -- each differently-protected area needs several segment
descriptors if its size is not an exact power of two.
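
The cost of separate protection areas can be estimated the same way, by
counting 1 bits in each area's size and adding them up.  The sketch below
uses hypothetical names and ignores alignment and any minimum segment size,
so it gives a lower bound rather than an exact count.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 1 bits in size: segments needed for one contiguous area. */
static unsigned ones(uint32_t size)
{
    unsigned n = 0;
    while (size != 0) {
        size &= size - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Descriptors needed when each area has its own protection and therefore
 * cannot share a segment with its neighbours. */
static unsigned descriptors_for_regions(const uint32_t *sizes, size_t count)
{
    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += ones(sizes[i]);
    return total;
}
```

As an example, 20K of read-only code plus 12K of writable data would fit in
one 32K segment if they could share protection bits.  Protected separately
they need 16K + 4K and 8K + 4K, which is four descriptors.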

The fact that the 68451 introduces TWO wait states into memory access is a
serious black mark against it in my book, too.

Of course, it's possible to live with all of this.  And user programs still
see a nice, simple, linear address space.  But I wouldn't want to try to do
demand paging with the 68010 on top of the 68451 -- demand paging is
complicated enough with a more rational MMU.

What are my qualifications for saying all this?  I've written all the kernel
memory management code for the Sun-1, Sun-2 and Iris PM-II versions of the
Stanford V kernel.  (See IEEE Software, April 1984 for an article about the
V kernel.)  These machines all happen to be 68000/68010-based systems with
custom MMUs built with mostly TTL.  The current V kernel implements multiple
address spaces with variable upper limits, and multiple processes per
address space.  There is currently no demand paging -- all code and data must
be resident at all times -- but I will be implementing that soon as a
byproduct of some of my thesis research.  (Adding swapping to the current
system would be trivial.)  I formed my opinions about the 68451 in the
process of trying to explain the current memory management to some folks who
are trying to port the kernel to a system with a 68451, and helping them
find solutions to the problems they ran into.

	--Tim