Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!uwvax!astroatc!prairie!dan
From: dan@prairie.UUCP (Daniel M. Frank)
Newsgroups: comp.arch,comp.sys.nsc.32k,comp.sys.intel,comp.sys.m68k
Subject: Re: Question:  on-chip or off-chip MMU?
Message-ID: <441@prairie.UUCP>
Date: Fri, 24-Apr-87 22:53:28 EDT
Article-I.D.: prairie.441
Posted: Fri Apr 24 22:53:28 1987
Date-Received: Sun, 26-Apr-87 23:33:06 EDT
References: <5635@shemp.UCLA.EDU>
Reply-To: dan@prairie.UUCP (Daniel M. Frank)
Organization: Prairie Computing, Madison, Wisconsin
Lines: 59
Xref: mnetor comp.arch:1104 comp.sys.nsc.32k:117 comp.sys.intel:203 comp.sys.m68k:413

In article <5635@shemp.UCLA.EDU> fan@CS.UCLA.EDU (Roy Fan) writes:
>	What are the important deciding factors in designing a MMU
>on-chip or off-chip?
>
>	Question 1 :  are there any other factors that might affect the
>design of the MMU being on-chip or off-chip?

   Your list is pretty complete, but I'd like to broaden the discussion
a bit.  We can break the speed/pins issue down into two areas:  one is
the raw technological problem of pins and propagation delays, the other
is architectural.

   Short of new technologies such as optical computers and brute-force
methods such as ECL and freon cooling, the only way to speed up a
uniprocessor is to increase parallelism.  This can be done by pipelining,
which allows multiple instructions to be in various stages of execution
simultaneously, and by increasing the parallelism at each stage of the
pipeline.  Two ways to achieve the latter are to do operand validity
checking in parallel with other operations, and to do it as early as 
possible.  The advantage of doing it in parallel should be clear.  The
advantage of doing it early is that a bad reference is caught before many
later instructions have entered the pipeline and must be thrown out.
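
   A toy model makes the point about early checking concrete.  The sketch
below assumes a hypothetical in-order pipeline that accepts one instruction
per cycle and never stalls; the function name and the stage numbering are
illustrative only and do not describe any real chip.  It counts how many
younger instructions are already in flight when the check finally fails.

```python
def squashed_on_fault(depth, check_stage, faulting):
    """Count younger instructions discarded when `faulting` fails its check.

    Toy assumptions: stages are numbered 0 .. depth-1 (0 is fetch), one
    instruction enters per cycle, nothing stalls, and instructions are
    numbered 0, 1, 2, ... in program order.
    """
    if not 0 <= check_stage < depth:
        raise ValueError("check_stage must be a valid stage index")
    if faulting < 0:
        raise ValueError("faulting must be a non-negative instruction number")

    stages = [None] * depth
    next_id = 0
    # Clock the pipeline until the faulting instruction reaches the stage
    # where operand validity is checked.
    while stages[check_stage] != faulting:
        stages = [next_id] + stages[:-1]
        next_id += 1
    # Everything in an earlier stage is younger and has to be thrown out.
    return sum(1 for i in stages[:check_stage] if i is not None and i > faulting)


if __name__ == "__main__":
    depth = 5  # hypothetical five-stage pipeline
    for stage in range(depth):
        lost = squashed_on_fault(depth, stage, faulting=3)
        print(f"check in stage {stage}: {lost} instruction(s) squashed")
```

   In this model the loss grows linearly with the stage in which the check
is made, which is the whole argument for checking early.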

   Anyway, the more parallelism you want, the more integrated your memory
management hardware has to be with the CPU.  If you put it off-chip,
you'll need more pins, and the propagation delays may slow down your
overall cycle time.

   The architectural issue is more subtle.  Can we tailor our architecture
in such a way that we can either inform the chip early about our addressing
intentions, or break such information up so that there is less work to do
at critical times?  I claim that the 80x86 series does just that (whether
well or badly) by checking segment validity at segment register load time,
leaving only boundary and page presence checking for the execution of actual
references.  This is probably less interesting on the 80386, where 32-bit
offsets let one segment span the whole address space, so segment register
loads are bound to be much less frequent than on the 80286.
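
   The split between load-time and reference-time checking can be sketched
roughly as follows.  This is a much-simplified model, not the actual 80286
or 80386 rules: the class names, the descriptor fields and the set of
resident pages are all assumptions made for illustration, and the real
parts check more at load time (segment type and privilege, for instance).

```python
from dataclasses import dataclass


class Fault(Exception):
    """Stands in for a protection fault or page fault."""


@dataclass
class Descriptor:
    """Hypothetical, much-simplified segment descriptor."""
    base: int
    limit: int            # highest valid offset within the segment
    present: bool = True


class SegmentRegister:
    def __init__(self):
        self.base = None
        self.limit = None

    def load(self, table, selector):
        """Validity checks paid once, when the segment register is loaded."""
        if not 0 <= selector < len(table):
            raise Fault("selector outside descriptor table")
        desc = table[selector]
        if not desc.present:
            raise Fault("segment not present")
        # Cache what the per-reference check will need.
        self.base, self.limit = desc.base, desc.limit

    def translate(self, offset, resident_pages, page_size=4096):
        """Cheaper checks paid on every reference: limit and page presence."""
        if self.limit is None:
            raise Fault("segment register not loaded")
        if not 0 <= offset <= self.limit:
            raise Fault("offset beyond segment limit")
        linear = self.base + offset
        if linear // page_size not in resident_pages:
            raise Fault("page not present")
        return linear


if __name__ == "__main__":
    table = [Descriptor(base=0x10000, limit=0xFFF)]
    ds = SegmentRegister()
    ds.load(table, 0)                          # validity checked here, once
    print(hex(ds.translate(0x10, {0x10})))     # only limit and page checks
```

   The design point is that `load` runs rarely and `translate` runs on
every reference, so work moved from the second to the first comes off the
critical path.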

>	Question 2 :  if there is enough space on the chip, would
>everybody put the MMU on-chip?

   I suppose so.  If there were enough space (and they could cool it!),
they'd try to put EVERYTHING on the chip.

>	Question 3 :  if there is only enough room for either a cache
>or a MMU, which one will prevail?

   My knee-jerk response is:  it is so hard to really integrate an external
MMU with a pipelined processor that you'll win by putting the MMU and a
small cache on chip, and putting a larger cache off-chip.  I hear the
two-level cache worked out pretty well in the MicroVAX.
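
   For a feel of why a small on-chip cache backed by a larger off-chip one
can pay off, here is the usual average-access-time arithmetic as a sketch.
All of the cycle counts and miss rates are made-up numbers chosen for
illustration, not measurements of any machine, and the function name is
hypothetical.

```python
def average_access_time(l1_hit, l1_miss_rate, memory, l2_hit=None, l2_miss_rate=None):
    """Average cycles per reference for a one- or two-level cache.

    l1_hit, l2_hit and memory are access times in cycles; the miss rates
    are fractions between 0 and 1.  With no second level, every first-level
    miss goes straight to main memory.
    """
    if l2_hit is None:
        miss_penalty = memory
    else:
        # A first-level miss costs a second-level access, plus a trip to
        # main memory for the fraction that also misses there.
        miss_penalty = l2_hit + l2_miss_rate * memory
    return l1_hit + l1_miss_rate * miss_penalty


if __name__ == "__main__":
    # Hypothetical figures: 1-cycle on-chip cache with a 10% miss rate,
    # 4-cycle off-chip cache with a 20% miss rate, 20-cycle main memory.
    one_level = average_access_time(1, 0.10, 20)
    two_level = average_access_time(1, 0.10, 20, l2_hit=4, l2_miss_rate=0.20)
    print(f"on-chip cache only:      {one_level:.2f} cycles")
    print(f"on-chip plus off-chip:   {two_level:.2f} cycles")
```

   With these invented numbers the second level cuts the average from about
3 cycles to about 1.8; real results depend entirely on the actual miss
rates and latencies.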

   [The preceding was an "architectural discussion".  In some circles, this
is also known as a "religious discussion".  Consider yourself warned.]

-- 
      Dan Frank (w9nk)
	ARPA: dan@db.wisc.edu			ATT: (608) 255-0002 (home)
	UUCP: ... uwvax!prairie!dan		     (608) 262-4196 (office)
	SNAILMAIL: 1802 Keyes Ave. Madison, WI 53711-2006