Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!bbn.com!lkaplan
From: lkaplan@bbn.com (Larry Kaplan)
Newsgroups: comp.arch
Subject: Re: fork and preallocation (was Re: Paging page tables)
Message-ID: <58184@bbn.BBN.COM>
Date: 13 Jul 90 14:53:50 GMT
References: <920@dgis.dtic.dla.mil> <5830@titcce.cc.titech.ac.jp> <5DL4SPD@xds13.ferranti.com> <5855@titcce.cc.titech.ac.jp>
Sender: news@bbn.com
Reply-To: lkaplan@BBN.COM (Larry Kaplan)
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 66

In article <5855@titcce.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <5DL4SPD@xds13.ferranti.com>
>	peter@ficc.ferranti.com (Peter da Silva) writes:
>
>>> 1) An utterly broken implementation where some important system
>>> process (such as inetd, ypbind or sendmail) may be killed if there
>>> is not enough swap space.
>
>>Alternatively, put the program in a wait state until swap space is available.
>>Deadlocks are possible, but unlikely. Indefinite deferment is more likely,
>
>No.
>
>Once swap space shortage occurs, it will tend to occur continually
>until some large process exits. So, if all such processes are put in
>wait states (which is very likely to occur, because active processes
>often require new pages) the situation is deadlock.
>

First, preallocation still has problems with dynamic allocation.  While
deadlock doesn't occur, random processes can still die (of their own doing)
when some malloc call fails because it cannot reserve swap space.  Without
preallocation this does not happen, which gives you the opportunity to get
more work done and possibly never run into trouble.
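
To make the trade-off concrete, here is a toy model of paging-space
accounting.  It is only a sketch: the class SwapPool, its methods, and the
page counts are all made up for illustration and do not describe any real
kernel.  With preallocation a request is refused as soon as reservations
exceed the paging space; without it the request is granted and only pages
that are actually touched consume space.

```python
class SwapPool:
    """Toy model of paging-space accounting (hypothetical, for illustration)."""

    def __init__(self, total_pages, preallocate):
        self.total_pages = total_pages
        self.preallocate = preallocate
        self.reserved = 0   # pages promised at allocation time
        self.used = 0       # pages actually backed by paging space

    def malloc(self, pages):
        """Return True if the allocation request is granted."""
        if self.preallocate:
            if self.reserved + pages > self.total_pages:
                return False          # caller sees malloc fail right away
            self.reserved += pages
        return True                   # lazy policy: request always granted

    def touch(self, pages):
        """Return True if backing store was found for the touched pages."""
        if self.used + pages > self.total_pages:
            return False              # lazy policy can run out here instead
        self.used += pages
        return True


def run(preallocate):
    pool = SwapPool(total_pages=100, preallocate=preallocate)
    first = pool.malloc(80)    # a big, sparsely used allocation
    pool.touch(10)             # ...of which only 10 pages are ever touched
    second = pool.malloc(40)   # another process asks for 40 pages
    third = pool.touch(30) if second else None
    return first, second, third
```

In this model run(True) returns (True, False, None): the second request is
refused although only 10 of the 100 pages are in use.  run(False) returns
(True, True, True), since the pages actually touched fit easily.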

Next, there are actually ways to handle the deadlock.  Note that Mach-based
implementations are willing to page on just about any vfs available.
This means that Mach will page to Unix filesystems or NFS filesystems if
desired.  The kernel (or some appropriately "wired" daemon) could note
that the system was running out of paging space, and make arrangements to
either suspend memory-consuming processes or mount more filesystems.  Even
if the deadlock actually occurred, you could suspend all the processes waiting
for swap space, and then mount some reserve filesystem.  Some care would then
be necessary to let the important jobs finish.  It may be necessary to continue
jobs selectively instead of all at once, to prevent a repeat of the deadlock.
Even if you need some more memory to get the mounting done, you could kill
some non-critical system daemon that could be started later (like lpd or
something).  Later on, you could decide to restart the daemon killed earlier,
and/or unmount filesystems that are no longer in use.  Eventually, you could
return to normal operation.
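
The recovery steps above can be sketched as a small simulation.  This is
not Mach code: the function recover(), the dictionary fields, and the
ordering rule (critical jobs first, then smallest demand first) are my own
assumptions, chosen only to show what "continue jobs selectively" might
look like.

```python
def recover(procs, free_pages, reserve_pages, low_water):
    """Toy version of the recovery steps; returns (resumed names, free pages).

    procs: list of dicts with 'name', 'needs' (pages wanted) and
    'critical' (bool).  All names and numbers here are hypothetical.
    """
    if free_pages >= low_water:
        return [p["name"] for p in procs], free_pages   # nothing to do

    # 1. Suspend everything waiting for paging space (modelled here by
    #    treating every process as stopped until explicitly resumed).
    waiting = list(procs)

    # 2. Mount the reserve filesystem and start paging on it as well.
    free_pages += reserve_pages

    # 3. Continue jobs selectively: critical ones first, then smallest
    #    demand first, and only while the demand still fits.
    waiting.sort(key=lambda p: (not p["critical"], p["needs"]))
    resumed = []
    for p in waiting:
        if p["needs"] <= free_pages:
            free_pages -= p["needs"]
            resumed.append(p["name"])
    return resumed, free_pages
```

Given a 500-page job, a critical 20-page inetd, and a 100-page compile,
with 5 pages free and a 200-page reserve, this resumes inetd and the
compile and leaves the large job suspended, which avoids walking straight
back into the same shortage.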

This is a little complicated but certainly doable, and it lets you avoid
reserving swap space on memory allocation and use a true COW fork().
Even if such a daemon were not implemented, some of this could actually be
done by hand by an operator.  People may complain that this is not a truly
general solution, and I would agree.  However, combined with
the added flexibility of no preallocation, it seems justifiable.
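
For a feel of what a true COW fork() saves, here is a rough model of the
paging space a fork commits under each policy.  The function fork_cost()
and its page counts are hypothetical and ignore page tables and other
kernel overhead.

```python
def fork_cost(parent_pages, pages_written_after_fork, cow):
    """Pages of paging space one fork() commits under each policy (toy model)."""
    if cow:
        # Copy-on-write: parent and child share pages until one of them
        # writes; only the written pages get a private copy.
        return min(pages_written_after_fork, parent_pages)
    # Preallocation: reserve room for a full copy of the parent up front.
    return parent_pages
```

A 10000-page process that forks and writes 50 pages before exec commits 50
pages with COW and all 10000 with preallocation in this model.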

As a side note, on the large systems I work on, we don't do preallocation
and have never run out of paging (swap) space.  This is not to say that
we never will, but typical systems have at least 10 times as much disk
storage as main memory, and in some cases as much as 100 times as much.
Even if the user filesystems are full, some care is taken to leave some other 
partitions available for paging.  I claim it is hard to fill that much disk
space with paging and swapping traffic and still have a usable system.  You'll
probably be thrashing to death long before that.

#include <std_disclaimer>
_______________________________________________________________________________
				 ____ \ / ____
Laurence S. Kaplan		|    \ 0 /    |		BBN Advanced Computers
lkaplan@bbn.com			 \____|||____/		10 Fawcett St.
(617) 873-2431			  /__/ | \__\		Cambridge, MA  02238