Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!gem.mps.ohio-state.edu!tut.cis.ohio-state.edu!ucbvax!agate!saturn!wilson@carcoar.Stanford.EDU From: wilson@carcoar.Stanford.EDU (Paul Wilson) Newsgroups: comp.os.research Subject: Re: adaptive replacement better than Working Set? Message-ID: <8949@saturn.ucsc.edu> Date: 30 Aug 89 21:53:09 GMT Sender: usenet@saturn.ucsc.edu Organization: Stanford University Lines: 109 Approved: comp-os-research@jupiter.ucsc.edu A couple of people have rather liked my idea for an adaptive replacement policy, but expressed doubts that it could be implemented efficiently. I think it can, without any special hardware (no use bits, even) and with only about a hundred or so statements worth of extra bookkeeping overhead at each page fault. The software would be a bit more complicated, maintaining an extra queue or two, but with only a small time cost. The implementation I'm thinking of would be a variant of the Babouglu-Ferrari LRU approximation. (That is, an initial FIFO queue that feeds into an LRU queue which is access-protected; when a page in the LRU part is accessed, an exception moves it to the head of the FIFO queue again. This gives a good approximation of LRU without the need for use bits. The access-protected LRU queue gives the LRU effect, while the initial FIFO queue gives pages a "grace period" in which they don't incur any more access-protection faults. [Babouglu & Ferrari, "Two-level replacement decisions in paging stores," IEEE Trans. on Computers, 32:12, December 1983].) With a scheme like this, it's easy to keep a record for each page in memory saying approximately how long the longest gap between references is, so far. Whenever a page goes unreferenced for a long time, it tends to get moved into the LRU queue, and the longer it goes unreferenced, the further it gets through the queue. So whenever you have a fault on one of those pages, you update its record (if it got further than it had before). There are two ways you can use this information. One is to evict some pages before they get to the end of the queue -- pages that get pretty far into the queue without being referenced again are good candidates for early eviction. Another possibility is to keep these records for pages that have *already* been evicted -- pages that tend to cause faults because they are re-referenced shortly after eviction can be given special treatment (more time in memory). That way, if the policy screws up on a page once, it learns from it. Such pages can be kept in a separate "troublemaker" LRU queue that presumably moves at a lower speed. (Or pages could be marked so that they can run through the main queue more than once, decrementing a count each time.) This system could pose a problem because you have to keep records for pages that aren't in memory, but a good approximation should be easy. You don't actually have to keep this record for all pages, just a queue of the last N pages, where N is comparable to the number of pages in memory. (Any pages that are re-referenced over a larger interval than that probably should have been evicted anyway, since the space could be put to better use in the meantime. The only problem with this approximation is that over the very long term you may "forget" about some troublemaker pages and have to learn about them again. But a page that is periodically re-referenced at an interval, say, double the LRU queue time will incur one extra page fault and thereafter be retained until it goes a much longer period unreferenced.) The overhead here seems pretty trivial. You keep an LRU queue of recently-evicted pages. This queue is actually also a hash table (You can implement such a thing so that queue and hash table operations are all constant-time, a few statements each). In addition, for each page in either the in-memory queues or recently-evicted queue, you keep a behavior record which is just the longest observed interreference gap. At each page fault, you just check to see if the page is in the recently-evicted table/queue. If so, you put it in the troublemaker queue instead of the regular queue (and bump the LRU page out of the troublemaker queue). A slight added complication: you want the regular queue to steal pages from the troublemaker queue if the troublemaker pages are going unreferenced, and vice versa if they're heavily referenced. Then the system will tend to act like LRU in situations where LRU works well, and like a "hot set" caching strategy otherwise. (That is, only allocating limited "buffer space" to operations that have such poor locality that more buffer space would do no good, e.g. scanning once through more data than will fit in memory.) (This inter-queue page stealing is easy enough to do in an approximate way if both are really FIFO-LRU pairs of queues; if the rear of a queue is not being hit and causing an exception pretty regularly, the queue can afford to be a little smaller. You probably want to rig it so that the troublemaker queue has a period some fixed constant longer than the regular queue's.) These policies seem to me to be easy enough to implement, at least in terms of fixed-space policies; I haven't thought much about the equivalent modification to something like WS. The differential- treatment schemes may also have problems that are similar to the problems fixed-space LRU has relative to (variable-space) WS; at some times a given queue position indicates a different time period than others, because the rate of page faults varies. On the other hand, it seems to me that this may be *exactly* what you want, because it may adapt to aggregate locality changes. When you switch working sets, you fault in a bunch of pages of the new working set. This tends to move the old working set toward the end of the regular queue. Any parts of the old working set that are shared with the new working set will then tend to be referenced in that part of the queue, and will be given special treatment. Thus the pages that are common to multiple working sets will tend to be preferentially retained in memory. Any comments? -- Paul Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@bert.eecs.edu Box 4348 Chicago,IL 60680