Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rochester!ritcv!cci632!rb
From: rb@cci632.UUCP (Rex Ballard)
Newsgroups: net.arch
Subject: Re: paging and loading
Message-ID: <389@cci632.UUCP>
Date: Mon, 22-Sep-86 15:08:33 EDT
Article-I.D.: cci632.389
Posted: Mon Sep 22 15:08:33 1986
Date-Received: Mon, 22-Sep-86 21:33:29 EDT
References: <832@hou2b.UUCP>
Reply-To: rb@ccird1.UUCP (Rex Ballard)
Organization: CCI, Rochester Development, Rochester, NY
Lines: 126
Summary: Situations for each.

In article <832@hou2b.UUCP> dwc@hou2b.UUCP (D.CHEN) writes:
>>>3) The real reason for virtual memory (and the one which won't
>>>away when memories get big) is that I can quickly load [a]
>>>working set of a large program ...
>
>>... but it seems to me that you'll use up [the faster startup] and
>>a lot more in page faults. Since you are reading the program
>>piecemeal into virtual memory you are going to be a lot slower
>>because of the extra seek and rotational delays. It is sort of like
>>a guy driving on a surface street, instead of going over and getting
>>on the freeway. You get started faster, but you hit all those
>>traffic lights. The only case that is valid is if the overwhelming
>>majority of the pages in the program are never referenced.

In computing, there is an appropriate analogy to the "freeway"
described. This would be the single process, single processor, single
task, single level, single loop, in which data is accessed in either a
nearly "random" or a "large loop" fashion. For such applications, such
as scientific matrix manipulation or calculations, on single processor
systems like the Cray 1, VM is definitely a lose.

The next question is, is this model still appropriate? Scientific and
number crunching applications are finding a better home in
multi-processor environments, which effectively become multi-tasking
systems even when only a single "program" is being run.
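To put rough numbers on the freeway versus surface-street argument quoted above, here is a minimal sketch of the two loading strategies. Every constant in it (seek time, rotational delay, transfer rate, page size) is an assumption chosen for illustration, not a measurement of any real disk, and the function names are hypothetical.

```python
# Every figure here is an illustrative assumption, not a measurement.
SEEK_MS = 30.0             # assumed average seek time
ROTATE_MS = 8.3            # assumed average rotational delay
TRANSFER_MS_PER_KB = 1.0   # assumed transfer rate, about 1 MB/s
PAGE_KB = 1                # assumed page size


def per_fault_ms():
    """Cost of one page fault if it pays its own seek and rotation."""
    return SEEK_MS + ROTATE_MS + PAGE_KB * TRANSFER_MS_PER_KB


def bulk_load_ms(program_kb):
    """The 'freeway': one seek, one rotation, one contiguous transfer."""
    return SEEK_MS + ROTATE_MS + program_kb * TRANSFER_MS_PER_KB


def demand_load_ms(pages_touched):
    """The 'surface street': each touched page is fetched separately."""
    return pages_touched * per_fault_ms()


def break_even_pages(program_kb):
    """Number of faults at which demand paging costs as much as bulk."""
    return bulk_load_ms(program_kb) / per_fault_ms()


if __name__ == "__main__":
    size_kb = 512
    print("bulk load (ms):", bulk_load_ms(size_kb))
    print("break-even page count:", break_even_pages(size_kb))
```

The model charges every fault a full seek, which is the pessimistic case the quoted poster describes. Contiguous binaries or clustered reads would lower the per-fault cost, and the model ignores other processes competing for the disk and for memory.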
>another analogy (and one that i've been using) is this: imagine
>that you have to xerox ten sets of notes and each set consists of
>ten pages. if there is a large setup time on the machine, you would
>like to copy the 100 pages in one shot. even without a large setup
>time, if there are other people on line, you would probably want to
>copy all of your work in one shot instead of getting on the end of
>the line every 10 pages.

Ok, let's continue with this. Do you really want to make the person
who needs two copies wait until your "batch job" is complete? It may
be more desirable to have a second machine for the "smaller jobs", or
to split the 100 page copy among more machines, by making one copy and
putting that on another machine. If you make the "little jobs" wait,
then their owners are more likely to save up many "little jobs" to get
one "big job" that is "worth the wait".

I even lean toward the "grocery store" analogy. Many such stores have
"express lanes" so that the person with 2 items doesn't have to wait
for five people who are buying for the month. The "bulk buyers" line
may also be staffed with a bagger, a scanner, and a nice conveyer
belt, while the "express lane" might only be a counter and a manual
cash register. Many stores adhere to a "3 person" rule, where clerks
are added as the number of people in a line exceeds 3, be they express
or bulk. Without this technique, people would only come to the grocery
store when they were buying large quantities, and go to other stores
for small purchases.

>the answer is "it depends". if your executable is on a unix file
>system, you probably would have to do multiple i/os to load the
>entire address space anyway. however, if it is contiguous on some
>swap device, then it depends on program behavior.

Some Unix variants that are specifically designed to support VM keep
executable binaries in contiguous form. The read-only portion need
not be mapped or swapped.
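Returning to the copier and express-lane analogies above, a toy model shows how long the small jobs wait under each arrangement. This is only a sketch: all jobs are assumed to arrive at once, service time is assumed to equal job size, and the function names and the `limit` parameter are made up for the illustration.

```python
def waits_single_queue(jobs):
    """Wait of each job when one server takes them in arrival order."""
    waits, clock = [], 0
    for size in jobs:
        waits.append(clock)   # job waits for everything ahead of it
        clock += size
    return waits


def waits_with_express(jobs, limit):
    """Two servers: jobs of at most `limit` units use the express one."""
    waits = []
    clocks = {"express": 0, "bulk": 0}
    for size in jobs:
        lane = "express" if size <= limit else "bulk"
        waits.append(clocks[lane])
        clocks[lane] += size
    return waits


if __name__ == "__main__":
    jobs = [100, 2, 100, 2]   # two big copy jobs, two 2-page jobs
    print(waits_single_queue(jobs))
    print(waits_with_express(jobs, limit=10))
```

The comparison is deliberately uneven, since the second case adds a machine. That matches the "second machine for the smaller jobs" suggestion rather than a reordering of a single line.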
>one important aspect that people rarely consider when talking about
>response time and loading is what happens if, in the process of demand
>paging (and loading of "working sets"), memory runs out? what are
>the implications on response time then? it would seem that without
>any other aids, pure demand paging is a clear loser in this situation.

Depending on the mechanism used, this might require up to 500
processes to be running simultaneously, assuming only a modest 1 meg
memory of 1K pages (1024 page frames, or roughly two per process).
There may be an argument for "request without fault" type accesses,
where the OS could be advised by the application that a new page will
be needed in the near future.

>danny chen
>ihnp4!hou2b!dwc

There are several types of processing: batch, interactive, pipelined,
I/O intensive, and transaction.

Some unpleasant experiences with a non-virtual memory transaction
processing system have given good cause to prefer VM for this type of
application. Similar experiences with interactive applications lead
to the same conclusion. Even a batch processing compiler that
attempted to do "hand optimised" overlays in a VM environment (and
thrashed itself to death) tends to make a case for VM.

Most applications spend 90% of their time executing only 10% of their
code. Interactive and transaction processing applications spend
nearly 95% of their time in a "wait for I/O, parse, loop" with an
occasional "do special purpose 'case' processing". There are more
than a few stories of such applications where swapping a loop smaller
than 2K caused serious degradation because of the 64K to 1Mb "tails"
that got swapped in with it.

Perhaps when it is possible to get 4 Gigabytes of 50ns ram for
$200-$300 that will require only a few watts, VM will become
unnecessary. Even then, however, not needing "heap compaction" and
being able to "remap" instead of "copy" data from one place to another
will continue to be a win for most applications.
Even this statement should have :-)'s all over it. I remember
thinking that 2K was a lot of memory on my VIP; the transition from
PDP-8 to PDP-11 thrilled many because of the thought of a 64K address
space. Microsoft's Bill Gates couldn't imagine why anyone would need
more than 640K with MS-DOS, and even Motorola was surprised to
discover that some systems couldn't fit in the 16 megabyte virtual
address space of the 68010.

There is also the issue of software costs in relation to the system
costs. When "making room" for an enhancement costs 10-50 times as
much as the enhancement itself, the long-term systems costs can get
out of hand very quickly.

With complete, detailed structure charts, data-flow diagrams and
data-hierarchy, it is possible to "manually" support anything from
simple "physical=logical" addressing, bank switching, overlays,
segment registers, and/or swapping to "virtual memory with virtual
libraries". For "automatic" support via linkers and compilers,
however, the benefits of "virtual memory" are difficult to match.

Finally, one of the main trade-offs of VM is the TLB lookup time.
When the processor is spending 10% of its time "looking for something
to do", CPU speed becomes much less important than Memory Management,
Disk Caching, and Co-processor interlock.
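The TLB trade-off can be put in rough terms with the usual textbook effective-access-time model. This is a sketch under assumptions rather than a description of any particular MMU: the function name, its parameters, and the idea that a miss costs a fixed number of extra memory accesses for the page-table walk are all simplifications introduced here.

```python
def effective_access_ns(hit_ratio, tlb_ns, mem_ns, walk_accesses=1):
    """Average time per memory reference with a TLB in the path.

    Every reference pays the TLB lookup and one memory access; a TLB
    miss additionally pays `walk_accesses` memory accesses to read the
    page table (an assumed, simplified miss penalty).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_penalty_ns = walk_accesses * mem_ns
    return tlb_ns + mem_ns + (1.0 - hit_ratio) * miss_penalty_ns


if __name__ == "__main__":
    # Hypothetical figures: 98% hit ratio, 20 ns TLB, 100 ns memory.
    print(effective_access_ns(0.98, 20, 100))
```

With a high hit ratio the average stays close to the no-miss cost, so the lookup time itself, paid on every reference, is the part of the trade-off that matters most in this model.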