Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!caip!think!husc6!seismo!cmcl2!lanl!jlg
From: jlg@lanl.ARPA (Jim Giles)
Newsgroups: net.arch
Subject: Re: paging and loading
Message-ID: <8523@lanl.ARPA>
Date: Sun, 12-Oct-86 10:31:08 EDT
Article-I.D.: lanl.8523
Posted: Sun Oct 12 10:31:08 1986
Date-Received: Fri, 17-Oct-86 09:12:30 EDT
References: <832@hou2b.UUCP> <7597@lanl.ARPA> <78@alberta.UUCP> <950@usl.UUCP>
Reply-To: jlg@a.UUCP (Jim Giles)
Organization: Los Alamos National Laboratory
Lines: 93

>>>Note that the computational speed of the Cray really does exceed I/O speed
>>>by about a factor of 10^6 - page faults in this kind of environment are
>>>EXTREMELY costly. 
>
>Not at all. The I/O processor does all the work of reading a page off
>of disk and stuffing it into memory (via DMA). Meanwhile, the CPU goes
>off and runs somebody else's job, taking advantage of that explicit
>multiprocessing. Assuming you're not using the entire machine for a
>single batch job, that is. So even if you have a working set larger
>than available memory, if you have a good scattering of jobs so that
>thrashing doesn't occur, the CPU still gets the same amount of work
>done -- it's just divided amongst users, instead of all happening to
>the same process.

Thank you for making my point so clearly!  There are applications (and whole
computing environments) where turnaround is MUCH MORE important than
throughput.
>
> Also note that page faults do not occur unless the working set of the
>program exceeds available physical memory -- if a page fault occurs,
>it means your program wouldn't have run without paging, anyway.

Sure it would! And more efficiently too!  Paging means that the program
must make do with the memory management and I/O scheduling that is
provided by the VM system.  The VM system is built with all sorts of
compromises since it must work acceptably well for many different
data usage patterns - it cannot make use of any specific knowledge
of the data usage that is known to the programmer.  (Oh, yes.
Some people have pointed out that the VM system COULD be augmented
with special calls to allow the programmer to inform it of the
data usage patterns to be expected.  There are three problems with this
approach: 1> Such calls are quite vague - at least as proposed, 2> the
use of such calls is as much a programming chore as explicit memory
management anyway, 3> no existing VM system I'm aware of has this type
of call implemented.)

> ...  And if
>a page fault DOESN'T occur (that is, if the program would have run in
>physical memory), then you can discard the overhead of page faults
>altogether, and just concentrate on the overhead of virtual address
>translation. Loading is loading. Loading 64K of data via 8 page faults
>should not be any more expensive than loading 64K of data in one fell
>swoop.
>
Not true.  8 page faults is 8 system calls (or equivalent); probably 8
disk seeks (unless you are alone on the system and no other user has
requested data from that disk since the last page fault); also it's
8 disk revolutions (unless the page faults are so close together that
the disk hasn't moved past the start of the next sector before you
ask for it).  The truth is that a single long consecutive disk transfer
is ALWAYS more efficient than a lot of small transfers.
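
A rough sketch of that cost argument follows. Every latency figure in it is
an assumed placeholder, not a measurement of any Cray or other drive, and
the function name is made up for illustration. The only point is that the
per-request overhead (system call, seek, rotational delay) is paid once per
request, while the raw transfer time is the same either way.

```python
# All figures are hypothetical placeholders chosen for illustration.
SYSCALL_MS = 0.1          # assumed system overhead per I/O request
SEEK_MS = 20.0            # assumed seek time per request
ROTATION_MS = 16.7        # assumed wait of one revolution per request
TRANSFER_MS_PER_KB = 0.5  # assumed raw transfer time per KB


def transfer_cost_ms(total_kb, requests):
    """Estimate the time to move total_kb using `requests` separate requests.

    Pessimistic model: every request pays the full fixed overhead,
    as when other users move the disk arm between page faults.
    """
    per_request = SYSCALL_MS + SEEK_MS + ROTATION_MS
    return requests * per_request + total_kb * TRANSFER_MS_PER_KB


one_transfer = transfer_cost_ms(64, 1)  # 64K in one fell swoop
eight_faults = transfer_cost_ms(64, 8)  # 64K via 8 page faults

print("one transfer: %.1f ms" % one_transfer)
print("eight faults: %.1f ms" % eight_faults)
```

With any positive per-request overhead, the eight-request case costs seven
extra overheads more than the single transfer; the gap shrinks only if the
faults arrive close enough together to avoid the seek and rotational delay.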

>>>desireable to make the pages large.  That way you can take advantage of the
>>>fact that most I/O comes from consecutive disk locations - therefore there
>>>are fewer seeks with large page size.
>
>I have never seen a disk system that did not fragment. Unless the disk
>page size was also similiarly large. Upon which you get lots of wasted
>space, because most files are about 1K long or so, and you'd have
>these mungo 16K disk pages only 1/16th full... Imagine, a 160 megabyte
>disk with only 10 megabytes used on it, totally full.
>
The CTSS (Cray Time Sharing System) requires an initial length in order
to create a file.  The file is then allocated CONSECUTIVELY (if there
is a large enough hole - some idle system time is spent consolidating
holes).  The file is only segmented if it is extended after its creation.
Even for an extension, the file is extended consecutively if possible.
In practice, most files are consecutive for most of their length.

Also, very little space is taken up by files as small as '1K long or so'.
To be sure, there are a lot of such files, but it would take 32,000 of
them to equal the size of an executing program image (I assume you
mean 1K bytes; program images can be 4M words).  Also - the disk sector
size is the granularity of disk allocation: it is 512 words no matter
what size the 'pages' are.  Changing page size does not affect disk
allocation in any way.
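
The arithmetic behind these figures can be checked with a short sketch. It
assumes 8-byte (64-bit) words, which is consistent with the 40MW = 320MB
conversion given in this article, and it reads "4M" and "1K" as binary
units, which is an assumption. The helper name allocated_bytes is made up
for illustration.

```python
BYTES_PER_WORD = 8                            # assumed: 64-bit words
SECTOR_WORDS = 512                            # sector size stated above
SECTOR_BYTES = SECTOR_WORDS * BYTES_PER_WORD  # 4096 bytes


def allocated_bytes(file_bytes):
    """Disk space taken by a file, rounded up to whole sectors.

    The page size does not appear here: allocation depends only
    on the sector size.
    """
    sectors = -(-file_bytes // SECTOR_BYTES)  # ceiling division
    return sectors * SECTOR_BYTES


image_bytes = 4 * 2**20 * BYTES_PER_WORD      # a 4M-word program image
small_files_per_image = image_bytes // 1024   # how many 1K files that is

print(small_files_per_image)    # roughly the 32,000 quoted above
print(allocated_bytes(1024))    # a 1K file occupies one full sector
```

Under these assumptions a 1K file occupies one 4096-byte sector whatever
the page size is, and it takes about 32,000 such files to match the data
in one 4M-word program image.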

I see that you really have no idea whatsoever about the environment of
problems inherent in large memory systems.  Even your examples give you
away - 160MB is not a very large disk drive, 1KB is not a typical file
size, etc.  Our disk drives hold about 1.2GB, files can be as large as
40MW (=320MB), average user data files are over 1MW (=8MB) - only source
code is as small as 1KB (which is rounded up to 512 words = 4KB because
that is the sector size of the disk drives).  Such small code sources
are also rare - the main users have codes which amount to 100,000 to
500,000 lines of Fortran (not counting the library support).  In short,
you are talking about a completely different environment than I am.

Learn something about large computing environments before you try to
force unwanted features upon them!

J. Giles
Los Alamos