Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site watdcsu.UUCP
Path: utzoo!watmath!watnot!watdcsu!herbie
From: herbie@watdcsu.UUCP (Herb Chong [DCS])
Newsgroups: net.arch
Subject: Re: Cache revisited
Message-ID: <1560@watdcsu.UUCP>
Date: Thu, 25-Jul-85 10:26:35 EDT
Article-I.D.: watdcsu.1560
Posted: Thu Jul 25 10:26:35 1985
Date-Received: Fri, 26-Jul-85 07:25:18 EDT
References: <5374@fortune.UUCP> <268@gcc-bill.ARPA>
Reply-To: herbie@watdcsu.UUCP (Herb Chong [DCS])
Distribution: net
Organization: U of Waterloo
Lines: 56
Summary: 

In article <268@gcc-bill.ARPA> brad@gcc-bill.UUCP (Brad Parker) writes:
>I'd like to compare and contrast the difference in performance between a
>simple single level paged memory manager using a ram (a la Sage 68000) and
>a system like the IBM DAT box, where the page tables are stored in main memory
>and cached in hardware. The point being that switching context is MUCH
>faster if you only need to change the pointer to the page tables, rather than
>copy 8K of paging information into the page table ram. It is assummed that
>the cache used to speed up the main memory page table accesses is sufficiently
>large to get a good hit rate (what ever that may be).

i've been doing a lot of reading lately on storage management at the
kernel (or as IBM prefers to call it, the nucleus) level of 370 and
370-XA machines because i may be working on kernel code for those
machines soon.  anyway, i should point out that there are two sets of
tables used by DAT.  there are segment tables in addition to page
tables.  segments are 1Mbyte and pages are 4Kbytes.
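
to make the two-level lookup concrete, here is a rough sketch in python.  only the 1Mbyte and 4Kbyte sizes come from the text above; the function names and the dict-of-dicts table layout are made up for illustration and are not how the hardware formats its entries.

```python
SEGMENT_SIZE = 1 << 20                          # 1Mbyte segments
PAGE_SIZE = 1 << 12                             # 4Kbyte pages
PAGES_PER_SEGMENT = SEGMENT_SIZE // PAGE_SIZE   # 256 pages in each segment


def split_virtual_address(vaddr):
    """Return (segment index, page index within the segment, byte offset)."""
    segment = vaddr // SEGMENT_SIZE
    page = (vaddr % SEGMENT_SIZE) // PAGE_SIZE
    offset = vaddr % PAGE_SIZE
    return segment, page, offset


def translate(vaddr, segment_table):
    """Walk segment table -> page table -> real address.

    segment_table is a hypothetical dict: segment index -> page table,
    where a page table is a dict: page index -> real origin of the page.
    """
    segment, page, offset = split_virtual_address(vaddr)
    page_table = segment_table[segment]   # first level of lookup
    frame_origin = page_table[page]       # second level of lookup
    return frame_origin + offset
```

for example, virtual address 0x123456 splits into segment 1, page 0x23 and offset 0x456, so every untranslated reference costs two table lookups in main storage.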

each address space (which can contain many processes but all owned by
the same user) has its own segment table entries which point to page
tables for that user.  all the processes in a single address space
occupy various sections of virtual memory and operate as co-routines so
that only one process can ever be running at one time in an address
space and control is transferred between processes by explicit calls by
the co-routines.  because all processes in an address space share the
same virtual memory, each can see all the others if it wants to, unlike
unix processes which are isolated from each other in terms of storage.

when a context switch is performed by the CPU, the hardware saves away
status in some block of storage and changes a segment table pointer
before loading new status of the next address space to execute.
i believe the actual size of information moved is on the order of
128 bytes, but i'm not completely sure.
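
as a toy comparison of the two approaches, the python sketch below counts bytes moved per context switch.  the 8K figure is the one quoted from brad's article and the 128 bytes is only my rough guess from above; the class and field names are invented for illustration and model no real machine.

```python
PAGE_TABLE_RAM_BYTES = 8 * 1024   # figure from the quoted article
STATUS_BLOCK_BYTES = 128          # rough guess from the text, not measured


class AddressSpace:
    def __init__(self, name, segment_table):
        self.name = name
        self.segment_table = segment_table   # stays put in main storage
        self.saved_status = {"pc": 0}


class DatStyleCpu:
    """Tables live in main storage; a switch only changes a pointer."""

    def __init__(self, first):
        self.current = first
        self.segment_table_pointer = first.segment_table
        self.status = dict(first.saved_status)
        self.bytes_moved = 0

    def context_switch(self, nxt):
        self.current.saved_status = dict(self.status)    # save old status
        self.segment_table_pointer = nxt.segment_table   # pointer change only
        self.status = dict(nxt.saved_status)             # load new status
        self.current = nxt
        self.bytes_moved += STATUS_BLOCK_BYTES


class TableRamCpu:
    """Page table held in a dedicated ram that is reloaded on each switch."""

    def __init__(self, first_table):
        self.table_ram = bytearray(first_table)
        self.bytes_moved = 0

    def context_switch(self, next_table):
        self.table_ram[:] = next_table        # copy the whole table
        self.bytes_moved += len(next_table)
```

switching once on each model moves 128 bytes in the first case and 8192 in the second, which is the difference brad was pointing at.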

the DAT hardware maintains a cache of segment and page table entries
(called the Translation Lookaside Buffer, TLB) which improves overall
performance because all storage references, whether by instruction
fetch or operand access, require information in the segment and page
tables.  the hardware maintains this cache, although there are
instructions provided for manipulating the entries.

the net result is a much more complex CPU and memory manager.  it would
be very interesting to compare a 68000 system to a single chip (or even
dozen chip) implementation of the full 370 hardware.  there is also
provision for prefix control where multiple CPUs can refer to the same
real address, but the memory manager uses the CPU prefix to decide
where the real block of storage is in real memory.  this only happens
for page 0 of memory.  you get the idea.
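
prefixing can be sketched in python like this.  the 4Kbyte block size follows the page size given earlier; the function name is made up, and the reverse mapping (references to a CPU's own prefix block landing on absolute page 0) is my understanding of the architecture rather than something stated above, so treat it as an assumption.

```python
PAGE_SIZE = 4096


def apply_prefix(real_addr, prefix):
    """Map a real address to an absolute one for a single CPU.

    prefix is the page-aligned origin of that CPU's private block.
    """
    offset = real_addr % PAGE_SIZE
    page_origin = real_addr - offset
    if page_origin == 0:
        return prefix + offset   # page 0 is redirected per CPU
    if page_origin == prefix:
        return offset            # assumed reverse mapping, see lead-in
    return real_addr             # everything else is unchanged
```

with prefixes 0x3000 and 0x5000, two CPUs that both reference real address 0x10 end up at 0x3010 and 0x5010 respectively.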

Herb Chong...

I'm user-friendly -- I don't byte, I nybble....

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!water!watdcsu!herbie
CSNET: herbie%watdcsu@waterloo.csnet
ARPA:  herbie%watdcsu%waterloo.csnet@csnet-relay.arpa
NETNORTH, BITNET, EARN: herbie@watdcs, herbie@watdcsu