Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: net.arch
Subject: Re: VERY LARGE main memories
Message-ID: <7094@utzoo.UUCP>
Date: Sat, 6-Sep-86 22:01:32 EDT
Article-I.D.: utzoo.7094
Posted: Sat Sep 6 22:01:32 1986
Date-Received: Sat, 6-Sep-86 22:01:32 EDT
References: <1130@bu-cs.bu-cs.BU.EDU> <7144@lanl.ARPA>, <7148@lanl.ARPA>
Organization: U of Toronto Zoology
Lines: 59

> The desirability of paging for such machines is not so obvious.
> Consider a code which updates a large array on each step through a
> loop (each time-step). If the central memory is too small to hold the
> entire array and you have a virtual memory scheme, some part of the
> array will get swapped out on each time step. Most likely, it will be
> the least recently used page that gets swapped - the very one that you
> will need first on the subsequent time step!...

Jim, all that you have established here is that LRU is a thoroughly bad
virtual-memory policy for a scientific program. Few people will argue
that. You have also more-or-less established that a program which
behaves in the manner you suggest will not benefit much from virtual
memory; its performance will degrade badly when it starts paging. Few
will argue that either. Not all programs behave that way, though.

You have *not* established that virtual memory, as such, is a poor idea.
It is quite possible to combine demand fetching with prefetching of
things that are expected to be needed soon. It's probably even a good
idea when trying to page scientific programs. It *is* harder to get
right, which is why you don't see it done much.

> Without virtual memory though, your code can anticipate the problem by
> initializing asynchronous I/O long before it needs to use the data.
> And, since it's not driven by page faults, you can select only a
> particular part of the array to be swapped - thus minimizing I/O.
> This kind of programming effort is somewhat unfashionable these days...

With some reason. What you're saying is that because the operating-system
people are too lazy to devise paging algorithms that are useful for large
scientific programs, the programmers should be required to do it
themselves. Apart from the matter of constantly reinventing the wheel,
there is also the problem that it's a lot of work to get it right --
program reference patterns are notorious for being hard to predict
beforehand, which means experimenting and then twiddling the code to
match the results.

This may perhaps not be needed for really straightforward array-mashing
code, but I remain a bit skeptical: historically, the batting average on
statements like "this program obviously has the following reference
pattern..." is close to zero. I wouldn't be surprised if a lot of
scientific code, with its carefully hand-twiddled asynchronous I/O, is in
fact managing its memory rather inefficiently. Especially if the code has
been revised, or moved to a new machine (or a new variant of the old
one), since the last tuning job was done.

> ... They bought the machine because
> the critical issue was SPEED - and anything that reduces this speed
> (like virtual memory) is to be shunned. (Cyber 205 users usually
> turn off the virtual memory when they need speed, Crays don't even
> have virtual memory.)

I think you may be confusing two issues here. The reason the Crays don't
have virtual memory is not because asynchronous I/O is superior to
paging, but because non-trivial address translation hurts memory-access
time. Are the Cyber 205 users turning the virtual memory off because they
don't trust the paging algorithm, or because the machine will run even a
memory-resident program faster with it off? I'd bet it's the latter. Now
*that* is a legitimate and well-justified reason for not using virtual
memory.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
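
Footnote on the LRU point: the pathology in the quoted example (a loop
that sweeps an array slightly larger than memory on every time step) is
easy to reproduce in a toy simulation. What follows is only a rough
sketch, not anything from the discussion itself: the function name
count_faults, the 5-page/4-frame sizes, and the choice of MRU eviction
as a contrasting policy are all assumptions made for illustration.

```python
from collections import OrderedDict

def count_faults(refs, frames, policy="lru"):
    """Count page faults for a reference string under LRU or MRU eviction."""
    resident = OrderedDict()  # keys ordered least -> most recently used
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
            continue
        faults += 1
        if len(resident) >= frames:
            # LRU evicts the oldest entry, MRU the newest one.
            resident.popitem(last=(policy == "mru"))
        resident[page] = True
    return faults

# Hypothetical workload: 3 time steps, each sweeping a 5-page array,
# on a machine with only 4 page frames.
sweeps = list(range(5)) * 3
lru_faults = count_faults(sweeps, frames=4, policy="lru")
mru_faults = count_faults(sweeps, frames=4, policy="mru")
print("LRU faults:", lru_faults)
print("MRU faults:", mru_faults)
```

With these made-up sizes, LRU faults on all 15 references, while the
MRU variant faults 7 times. That says nothing about real workloads; it
only shows that the trouble lies in the replacement policy rather than
in demand paging as such.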