Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site fortune.UUCP
Path: utzoo!linus!decvax!harpo!eagle!mhuxl!ihnp4!fortune!rpw3
From: rpw3@fortune.UUCP
Newsgroups: net.arch
Subject: Re: 16k vs 68k vs 432 - (nf)
Message-ID: <2279@fortune.UUCP>
Date: Sun, 15-Jan-84 23:01:59 EST
Article-I.D.: fortune.2279
Posted: Sun Jan 15 23:01:59 1984
Date-Received: Tue, 17-Jan-84 01:58:16 EST
Sender: notes@fortune.UUCP
Organization: Fortune Systems, Redwood City, CA
Lines: 52

#R:mddc:-29400:fortune:16500004:000:2562
fortune!rpw3    Jan 15 19:48:00 1984

I agree with Tom Teixeira that a good rule of thumb for page size is the
square root of the typical/mean/median (not maximum) segment size.
This corresponds, by the way, to roughly splitting the address bits
in half: the high half selects the page, the low half the byte within it.
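
As a rough illustration of the rule above (a sketch only; the function
name and the assumption that segment sizes are powers of two are mine,
not Tom's):

```python
def split_address_bits(segment_bytes):
    """Return (page_bytes, offset_bits, page_number_bits).

    Assumes segment_bytes is a power of two. The page size is the power
    of two at or just below sqrt(segment_bytes).
    """
    total_bits = segment_bytes.bit_length() - 1  # log2 of the segment size
    offset_bits = total_bits // 2                # low half: byte within page
    page_bits = total_bits - offset_bits         # high half: page number
    return 1 << offset_bits, offset_bits, page_bits


# A 64K typical segment suggests 256-byte pages and an 8/8 bit split.
print(split_address_bits(64 * 1024))
```

When the number of address bits is odd, this sketch gives the extra bit
to the page number, which is an arbitrary choice.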

Similar reasoning suggests using two-level page tables, where the top
level page table contains the addresses of second-level page table
pages, which contain physical addresses of pages (360/67, National
16000). Assuming the unreferenced second-level page table pages don't
have to be allocated, for small programs the total page table is small,
whereas for large ones it approximates the square root of the segment size.
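
A minimal sketch of such a two-level table, with second-level pages
allocated only on first use. The class name, the 8/8/8 bit split, and
the use of KeyError to stand in for a page fault are all assumptions
for illustration; this is not how the 360/67 or the 16000 lay out
their tables.

```python
class TwoLevelPageTable:
    def __init__(self, top_bits=8, second_bits=8, offset_bits=8):
        self.second_bits = second_bits
        self.offset_bits = offset_bits
        # Top-level table: one slot per second-level table, initially empty.
        self.top = [None] * (1 << top_bits)

    def _split(self, vaddr):
        offset = vaddr & ((1 << self.offset_bits) - 1)
        second = (vaddr >> self.offset_bits) & ((1 << self.second_bits) - 1)
        top = vaddr >> (self.offset_bits + self.second_bits)
        return top, second, offset

    def map(self, vaddr, frame):
        top, second, _ = self._split(vaddr)
        if self.top[top] is None:
            # Allocate the second-level table page only when first needed.
            self.top[top] = [None] * (1 << self.second_bits)
        self.top[top][second] = frame

    def translate(self, vaddr):
        top, second, offset = self._split(vaddr)
        table = self.top[top]
        if table is None or table[second] is None:
            raise KeyError("page fault")
        return (table[second] << self.offset_bits) | offset

    def allocated_second_level_tables(self):
        return sum(t is not None for t in self.top)


pt = TwoLevelPageTable()
pt.map(0x012345, frame=7)
print(hex(pt.translate(0x012345)), pt.allocated_second_level_tables())
```

A small program that touches only one region ends up with a single
second-level table, which is the point of not allocating the
unreferenced ones.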

While some systems copy the whole page table into a fast RAM during
process switching [UGH!], most systems (and all two-level systems I
know of) leave the page tables in main memory and use a small translation
cache (TB, TLA, TLB are common names) to avoid referencing the actual
page table pages when possible. Eight entries is a minimum for any
instruction set architecture which includes 3-address operations (such
as the 360/370 or the VAX) since the instruction and each of its three
operands may be split across a page boundary (4 items x 2 pages each).
Conversely, for non-pathological cases, eight entries is already well
down the curve of marginal improvement, so most systems have either 8
or 16 entries.

Such translation lookaside buffer architectures allow for very fast
context switching. If the instruction that loads the base of the page
table also clears the TLB, no further action is needed. The first few
memory accesses will fail to match in the TLB, which will force accesses
of the page table (cost = two reads), but the total overhead is usually
much less than loading the segmentation registers of, say, the
PDP-11/70.
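
The context-switch behavior can be sketched as follows. Everything
named here is hypothetical: the TLB class, the `walk` callable standing
in for the in-memory page table, and the LRU replacement policy (real
translation buffers use various policies). Each miss is charged two
reads, one per table level.

```python
from collections import OrderedDict


class TLB:
    def __init__(self, entries=8):
        self.entries = entries
        self.cache = OrderedDict()  # virtual page -> frame, in LRU order
        self.table_reads = 0        # main-memory reads spent on misses
        self.walk = None

    def load_page_table_base(self, walk):
        """Context switch: point at the new page table and clear the TLB."""
        self.walk = walk
        self.cache.clear()

    def translate(self, vpage):
        if vpage in self.cache:
            self.cache.move_to_end(vpage)   # hit: no page table access
            return self.cache[vpage]
        self.table_reads += 2               # miss: top-level + second-level
        frame = self.walk(vpage)
        if len(self.cache) >= self.entries:
            self.cache.popitem(last=False)  # evict least recently used
        self.cache[vpage] = frame
        return frame


process_a = {1: 10, 2: 20}
process_b = {1: 30}

tlb = TLB()
tlb.load_page_table_base(process_a.__getitem__)
tlb.translate(1)
tlb.translate(1)                 # second access hits in the TLB
tlb.load_page_table_base(process_b.__getitem__)
print(tlb.translate(1), tlb.table_reads)
```

Note that the switch itself does no copying at all; the cost shows up
only as a handful of misses afterward.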

Using two-level paging usually allows page tables to be memory resident,
avoiding the complexities of allocating page table entries from kernel
VIRTUAL memory (e.g. VAX).  On the other hand, VERY large programs that
spend lots of time in VERY small regions will carry a large in-memory
overhead in the page tables. (Such programs should probably be written
with overlays or subprocesses, but the world doesn't always do what's
"best".)

Two levels seem about optimal; maybe three could be justified (with effort)
if the system ran a mix of VERY large and VERY small jobs. Most don't.
(VERY large is >4M, VERY small is <16k.)

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphins Drive, Redwood City, CA 94065