Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!cmcl2!lanl!jlg
From: jlg@lanl.ARPA (Jim Giles)
Newsgroups: net.arch
Subject: Re: VERY LARGE main memories: crypt
Message-ID: <7847@lanl.ARPA>
Date: Tue, 23-Sep-86 20:55:12 EDT
Article-I.D.: lanl.7847
Posted: Tue Sep 23 20:55:12 1986
Date-Received: Wed, 24-Sep-86 22:01:37 EDT
References: <1276@bu-cs.bu-cs.BU.EDU> <7537@lanl.ARPA> <1819@mmintl.UUCP>
Reply-To: jlg@a.UUCP (Jim Giles)
Organization: Los Alamos National Laboratory
Lines: 64

In article <1819@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>In article <7537@lanl.ARPA> jlg@a.UUCP (Jim Giles) writes:
>>In article <1276@bu-cs.bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>>>Am I missing something? What's the huge problem with this hypothetical
>>>vadvise(CYCLIC)? Isn't it just MRU (most recently used)?
>>The are two thing wrong with this scheme.  1) Suppose that you problem
>>fails to fit in main memory by only a few page sizes.  The above scheme
>>will cycle each and every page of the whole data structure into memory
>>and back out again on every time step.
>
>No it won't.  Only the number of pages which don't fit are read in each
>cycle.  (E.g., if there are 103 pages and 100 will fit, 3 are read each
>cycle.)  In fact, MRU behaves optimally[1][2] for a purely cyclic access
>pattern.  In particular, it has less I/O overhead than setting aside a few
>pages at the end to be overlaid.
>
The MRU algorithm (as far as I can see) does the following:
    Assume M pages of data are to be used cyclically and only N pages
    will fit.  Initially, the first page is loaded and the code begins
    executing.  At the end of page 1, the next page is loaded, and so
    on for N pages.  Now you have a choice to make, do you flush page
    1 or page N to make room for N+1.  The MRU algorithm presumably
    flushes page N (the most recently used).  And so you continue up
    to page M.  Now on the second cycle through the data, the first N-
    1 pages are already there, so no page fault occurs until page N is
    needed.  Then a page fault occurs for every page until page M.

This doesn't seem optimal to me.  Or did I get the algorithm wrong?
In this case, even LRU (least recently used) seems more efficient since
some, at least, of the I/O is done asynchronously.  The most efficient
algorithm would be to load N pages at the start, when done with the first
page flush it out and load the N+1st page - and so on: always flushing the
most recently used and loading the most distantly required.  This makes
all the I/O as asynchronous as possible.

If M-N is small compared to N, you could reduce the I/O requirement by
loading N-M pages, skipping every M-Nth one, and then after every M-N
pages, flush the most recent and load the most distantly needed (as
before except that only M-N pages are being swapped in and out).

There are other techniques (that I've mentioned in previous
submissions) that reduce the I/O load or increase the amount of
asynchronous I/O in many cases.  Most of these techniques would be
difficult to provide in a general purpose VM system.

You might argue that MRU should be modified to include the technique
I just mentioned.  But how does a demand driven system know what the value
of M is?  How does it know that M remains unchanged from one cycle to the
next?  What is some other data structure (like an equation of state, for
example) is being referenced following a random pattern of use - at the
same time that the cyclically used data is being traversed: how do
I advise the VM system that some of the pages should be referenced
MRU and some should be LRU?

I can solve all the above problems explicitly if I have control over
memory contents.  I can't off hand see any solution that a VM system
could take advantage of.  A lot of work has been done in the
optimization of memory management for scientific codes.  Many of the
techniques that have resulted are not well suited to any demand driven
context but, instead, rely upon knowledge of the data use patterns of
specific algorithms.

J. Giles
Los Alamos