Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site LaBrea.ARPA
Path: utzoo!watmath!clyde!bonnie!akgua!whuxlm!harpo!decvax!decwrl!Glacier!LaBrea!mann
From: mann@LaBrea.ARPA
Newsgroups: net.arch
Subject: Re: I like segmented architectures
Message-ID: <121@LaBrea.ARPA>
Date: Fri, 14-Jun-85 19:44:59 EDT
Article-I.D.: LaBrea.121
Posted: Fri Jun 14 19:44:59 1985
Date-Received: Sat, 15-Jun-85 10:26:00 EDT
References: <276@spar.UUCP> <5653@utzoo.UUCP> <291@spar.UUCP> <1031@peora.UUCP> <118@LaBrea.ARPA> <1053@peora.UUCP>
Organization: Stanford University
Lines: 75

Wow! I guess it's time for me to pay the price for flaming by trying to back up my opinion.

First let me say I called the 68451 "utterly ridiculous garbage" in a moment of weakness. Normally I don't flame like that, but I've been reading laser-lovers lately, and suddenly the urge overcame me.

Anyway, my real opinion of the 68451 is that it's a lot better than no memory management unit at all, but it has some serious problems. Most were pointed out in previous postings, by people who said in effect "well, of course it has the following problems, but I don't understand why people hate it so much." I'll summarize the problems below.

The basic design of the 68451 forces the operating system to manage physical (as well as virtual) memory in 2^k sized chunks (called segments), where k is variable (but must be an integer!). A 2^k sized chunk must start on an address that is an exact multiple of 2^k. You can't choose a uniform small k (like 10) and simulate a page-oriented MMU because you rapidly run out of segment descriptors (there are only 32). The operating system code needed to manage physical memory in this way is much more complex than what is needed with a paging MMU. It is also difficult to avoid wasting a lot of physical memory in fragmentation and/or spending a lot of time copying data from one part of physical memory to another when an address space grows.
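To put numbers on the two constraints just described, here is a minimal sketch in Python. It models only what the paragraph above states (power-of-two segment sizes, alignment to the segment size, 32 descriptors). The function names are made up for illustration and do not correspond to anything in the 68451 or in any kernel.

```python
# Illustrative model only; names are hypothetical, not a real 68451 interface.
NUM_DESCRIPTORS = 32  # the post states the 68451 has only 32 segment descriptors


def is_valid_segment(base: int, k: int) -> bool:
    """A 2^k-byte segment must start on an exact multiple of 2^k."""
    return base % (1 << k) == 0


def max_mapped_with_uniform_k(k: int) -> int:
    """Total bytes mappable if every descriptor maps one 2^k-byte chunk."""
    return NUM_DESCRIPTORS * (1 << k)


if __name__ == "__main__":
    print(is_valid_segment(0x4000, 14))   # 16 KB segment on a 16 KB boundary
    print(is_valid_segment(0x6000, 14))   # 16 KB segment on an 8 KB boundary
    print(max_mapped_with_uniform_k(10))  # bytes covered by 32 "pages" of 1 KB
```

With k = 10 the whole chip maps only 32 KB at once, which is why simulating a paging MMU with uniform small segments runs out of descriptors so quickly.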
I'll try to make this clear with an example. Let's say you are implementing a Unix-style model of process address space, where there is an upper bound that is allowed to grow dynamically. If you try to minimize internal fragmentation by allocating very little more actual memory than the process is using at any given time, you use up a lot of segment descriptors since you need one for each 1 bit in the binary representation of the memory size. Then let's say you try to grow the space. Each time you grow it, you have the choice of doing it by tacking on a small chunk to the end (chewing up another segment descriptor), or replacing one or more of the existing chunks by larger ones. Replacing some chunk(s) by a larger one requires copying their data to a new place in physical memory, unless you had the good fortune to find that the existing chunks were contiguous and started on the right boundary in physical space, and the space following them was free. One could put in some heuristics to make this condition more likely, but these make the OS memory management code yet more complicated.

An advertised benefit of the 68451 is that it allows fast context switching because you can keep the segments for multiple address spaces in the chip at the same time. Of course, with only 32 segment descriptors to go around and each address space needing several, you run out rather quickly if you try to keep them all in there. So you need to swap them in and out, perhaps keeping the N most recently used sets in the chip in hopes of minimizing the amount of descriptor swapping.

Protection bits are also on a per-segment basis, so if you have different areas in your process that need different protections, you chew up more segment descriptors -- each differently-protected area needs several segment descriptors if its size is not an exact power of two.

The fact that the 68451 introduces TWO wait states into memory access is a serious black mark against it in my book, too.
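The descriptor arithmetic in the growth example and the protection paragraph above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not kernel code: the area sizes (20 KB, 45 KB, 7 KB) are invented, every name is hypothetical, and `is_free` stands in for whatever physical-memory allocator an OS would really have.

```python
# Illustrative model only; sizes and names are assumptions, not from the 68451.
NUM_DESCRIPTORS = 32  # the post states there are only 32 descriptors
KB = 1024


def segment_sizes(size):
    """Split a size into power-of-two chunks, largest first: one per 1 bit."""
    return [1 << bit
            for bit in range(size.bit_length() - 1, -1, -1)
            if (size >> bit) & 1]


def descriptors_needed(area_sizes):
    """Each differently-protected area needs its own set of descriptors."""
    return sum(len(segment_sizes(size)) for size in area_sizes)


def can_merge_in_place(base, old_size, new_size, is_free):
    """Contiguous chunks at base can become one new_size segment without
    copying only if base is aligned to new_size and the space that follows
    is free. is_free(addr, length) is a hypothetical allocator query."""
    aligned = base % new_size == 0
    return aligned and is_free(base + old_size, new_size - old_size)


if __name__ == "__main__":
    print([s // KB for s in segment_sizes(45 * KB)])  # chunks for a 45 KB area
    print([s // KB for s in segment_sizes(48 * KB)])  # after growing to 48 KB
    per_space = descriptors_needed([20 * KB, 45 * KB, 7 * KB])
    print(per_space, NUM_DESCRIPTORS // per_space)
```

Under these assumed sizes, one address space with three differently-protected areas takes 9 descriptors, so only 3 such address spaces fit in the chip at once. Growing the 45 KB area to 48 KB drops it from 4 chunks to 2, but only by replacing the 8, 4 and 1 KB chunks with a single 16 KB one, which means copying unless `can_merge_in_place` happens to hold.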
Of course, it's possible to live with all of this. And user programs still see a nice, simple, linear address space. But I wouldn't want to try to do demand paging with the 68010 on top of the 68451 -- demand paging is complicated enough with a more rational MMU.

What are my qualifications for saying all this? I've written all the kernel memory management code for the Sun-1, Sun-2 and Iris PM-II versions of the Stanford V kernel. (See IEEE Software, April 1984 for an article about the V kernel.) These machines all happen to be 68000/68010-based systems with custom MMUs built with mostly TTL. The current V kernel implements multiple address spaces with variable upper limits, and multiple processes per address space. There is currently no demand paging -- all code and data must be resident at all times -- but I will be implementing that soon as a byproduct of some of my thesis research. (Adding swapping to the current system would be trivial.)

I formed my opinions about the 68451 in the process of trying to explain the current memory management to some folks who are trying to port the kernel to a system with a 68451, and helping them find solutions to the problems they ran into.

--Tim