Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!brutus.cs.uiuc.edu!wuarchive!wugate!uunet!yale!leichter
From: leichter@CS.YALE.EDU (Jerry Leichter)
Newsgroups: comp.arch
Subject: Re: Memory utilization & inter-process contention
Message-ID: <70890@yale-celray.yale.UUCP>
Date: 28 Aug 89 22:56:02 GMT
Sender: root@yale.UUCP
Organization: Yale Computer Science Department, New Haven, Connecticut, USA
Lines: 103
X-from: leichter@CS.YALE.EDU (Jerry Leichter (LEICHTER-JERRY@CS.YALE.EDU))

In article <2389@auspex.auspex.com>, guy@auspex.auspex.com (Guy Harris) writes...
>>I must be lucky then. VMS has had one since day one. Working set
>>parameters can be set by either authorization information, as parameters
>>to the $CREPRC system service (which creates a process) or explicitly
>>during process execution (SET WORKING_SET command, or $ADJWSL system
>>service).
>
>OK, so which of the OSes that support working set scheduling do so
>without having to be told by some external agent what the working set of
>a process at some given time is? (Do the VMS calls even tell the OS
>that, or do they just tell it how big the working set is?)

They only specify the size. A process in VMS comes equipped with three
significant numbers controlling working set allocation: the limit, the
quota, and the extent.

The working set limit is essentially the "zero point" for working set
adjustment: When an image exits, the process's working set is set back to
its working set limit. (Recall that VMS doesn't create a new process per
executed image; rather, it re-uses the existing context. This is done by
having two threads of control within each process. User code runs in the
user-level thread; DCL, the equivalent of the Shell, runs in the
supervisor-level thread. (Actually, there are two more threads, executive
and kernel, but they aren't at issue here.)
When user-level code exits, all user-level structures are discarded, all
user-level channels are closed, and all pages of virtual memory owned by
user mode are freed. DCL then goes on to the next command. In overall
effect, this is like the Shell's use of fork/exec except with light-weight
threads (shared memory space). It's not EXACTLY like light-weight threads
as they are normally pictured because the hardware protects the
supervisor-mode thread from the user-mode thread.)

Anyhow, returning to working sets: The second parameter, the working set
quota, is the amount of VM that the process is guaranteed to have available
to it. On a busy system, it is the upper bound on the process's working set
size. If the system has plenty of spare memory, a process is allowed to
exceed its working set quota. The third parameter, working set extent, is a
hard limit on the size of a process's working set.

When a process with a working set size below its quota needs pages, and
none are available, the system will grab them back from processes that have
"borrowed" pages and are over quota. (If there are no "borrowed" pages to
be found, the system takes a number of other actions; ultimately, it swaps
entire processes out, making all of their pages available to satisfy the
demand for "guaranteed" pages.)

The net effect is that when the system is short of memory, processes page
against themselves, while when the system has memory to spare, they page
against the system-wide pool of pages, but are still subject to an upper
limit on size.

There are various parameters that control things like how many pages the
system must have free before it will allow processes to exceed their
working set quotas. In fact, there are probably 10 or 20 user-settable
parameters to let you tune the performance of the system. Some can be
changed on a running system; others require a reboot.
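To make the limit/quota/extent rules above concrete, here is a toy model in Python. It is only a sketch of the policy as described in this post, not of the actual VMS code: the names Process, on_page_fault, on_image_exit and BORROW_THRESHOLD are all made up for illustration, and it assumes limit <= quota <= extent.

```python
from dataclasses import dataclass

# Hypothetical tuning parameter: how many free pages the system wants on
# hand before it lets any process "borrow" pages beyond its quota.
BORROW_THRESHOLD = 100


@dataclass
class Process:
    ws_limit: int     # size the working set is set back to on image exit
    ws_quota: int     # size the process is guaranteed
    ws_extent: int    # hard upper bound, even with memory to spare
    ws_size: int = 0  # pages currently in the working set


def on_page_fault(proc: Process, free_pages: int) -> str:
    """Decide how a fault for a non-resident page is satisfied."""
    if proc.ws_size < proc.ws_quota:
        # Below quota: the page is guaranteed (the system may have to
        # reclaim borrowed pages elsewhere, which is not modelled here).
        proc.ws_size += 1
        return "grow"
    if proc.ws_size < proc.ws_extent and free_pages > BORROW_THRESHOLD:
        # Memory to spare: page against the system-wide pool.
        proc.ws_size += 1
        return "borrow"
    # Short of memory, or at the extent: the process pages against itself.
    return "replace-own-page"


def on_image_exit(proc: Process) -> None:
    """Shrink the working set back to the 'zero point'."""
    proc.ws_size = min(proc.ws_size, proc.ws_limit)
```

With a quota of 3 and an extent of 5, such a process grows freely to 3 pages, grows to 5 only while free_pages is above the threshold, and otherwise replaces one of its own pages on each fault.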
Setting these parameters is something of a black art; fortunately, standard
VMS comes with programs to set them automatically, based on actual workload.

>Jerry Leichter claimed that the VMS software people showed that a
>reference bit is not necessary, given the appropriate algorithms; does
>VMS now have appropriate algorithms to determine the contents of the
>working set without requiring a reference bit, or did the designers
>decide that the appropriate algorithm is to believe what the user or the
>application tells you and not try to figure it out for yourself? (Not a
>rhetorical question - I'm willing to accept that the latter is the
>appropriate algorithm, given sufficient evidence to demonstrate that
>claim.)

The basic algorithm is simple: Pages lost from a process working set go to
the end of a free list. If a fault occurs for a page on the free list, it
can be given back to the process; its contents will be unchanged. The lists
are not actually searched, since the structures defining virtual memory
contain sufficient information to find a page still on the free list with
little effort. Such "soft faults" are quite fast - not as fast as not
having incurred the fault at all, of course, but perhaps a few tens of
instruction times.

There's more to this, of course. For example, a "dirty" page can't go to
the free list; instead, it goes to a dirty list. When the dirty list gets
long enough, the modified page writer writes its pages out and eventually
moves them to the free list. A page remains eligible for return to the
process via a "soft" fault until it is actually given to some other
process. Under normal conditions, pages actually within the working set of
a process are almost certain to still be found on a free list even if they
have been grabbed.

There are all sorts of additional twists. For example, some pages can be
locked in the working set. Any program can do this, though it's rare that
user-mode programs do.
Rather, things like pages containing page tables get locked in the working
set. (Some pages HAVE to be locked in the working set to avoid deadlocks in
the pager.) Also, when VMS scans for pages to "steal" from a process, it
keeps enough context to try to avoid recently-faulted-in pages. (I don't
recall the details.)

Does this all work? Well, both VMS and (these days) Unix seem to support
virtual memory on VAXes with a great deal of success. I haven't seen anyone
claim to have an architecture of roughly equivalent CPU power/memory and
I/O bandwidth/etc. but WITH page reference bits which supports more
processes/does measurably better at VM management/etc. The VAX architecture
has been pretty heavily studied - there are a number of published papers -
and none of them that I have read has found a bottleneck in the page
replacement code.

							-- Jerry
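
For anyone who wants to play with the free list/dirty list scheme described above, here is a rough Python sketch. It is a toy model built only from the description in this post, not VMS code: the PageLists class, its method names, and the modified_high_water parameter are all hypothetical. Pages are identified by an arbitrary key such as (process, virtual page number), and a dictionary lookup stands in for "the structures defining virtual memory" that make searching the lists unnecessary.

```python
from collections import OrderedDict


class PageLists:
    """Toy model of a free list and a dirty (modified) page list."""

    def __init__(self, modified_high_water=4):
        self.free = OrderedDict()      # clean pages, oldest first
        self.modified = OrderedDict()  # dirty pages awaiting write-back
        self.modified_high_water = modified_high_water  # hypothetical knob
        self.writes = 0                # counts simulated disk writes

    def evict(self, key, dirty):
        """A page leaves a working set: queue it at the end of a list."""
        if dirty:
            self.modified[key] = None
            if len(self.modified) >= self.modified_high_water:
                self.write_modified()
        else:
            self.free[key] = None

    def write_modified(self):
        """Stand-in for the modified page writer."""
        while self.modified:
            key, _ = self.modified.popitem(last=False)
            self.writes += 1           # pretend to write the page out
            self.free[key] = None      # now clean, so it joins the free list

    def fault(self, key):
        """Return 'soft' if the page can be handed back, else 'hard'."""
        for pages in (self.free, self.modified):
            if key in pages:
                del pages[key]         # direct lookup, no list search
                return "soft"
        return "hard"                  # contents must come from disk

    def allocate_frame(self):
        """Reuse the oldest free page; its former owner loses it for good."""
        if self.free:
            key, _ = self.free.popitem(last=False)
            return key
        return None
```

A page that has been evicted but not yet handed out by allocate_frame() faults back in as "soft"; once the frame has been reused, the same fault is "hard".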