Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!kddlab!titcca!cc.titech.ac.jp!necom830!mohta
From: mohta@necom830.cc.titech.ac.jp (Masataka Ohta)
Newsgroups: comp.arch
Subject: Re: fork and preallocation (was Re: Paging page tables)
Message-ID: <5870@titcce.cc.titech.ac.jp>
Date: 16 Jul 90 08:02:08 GMT
References: <920@dgis.dtic.dla.mil> <5830@titcce.cc.titech.ac.jp> <5DL4SPD@xds13.ferranti.com> <5855@titcce.cc.titech.ac.jp> <58184@bbn.BBN.COM>
Sender: news@cc.titech.ac.jp
Organization: Tokyo Institute of Technology
Lines: 100

In article <58184@bbn.BBN.COM> lkaplan@BBN.COM
	(Larry Kaplan) writes:

>While deadlock doesn't occur, random processes can still
>die (of their own doing) when some malloc call fails to be able to reserve 
>swap space.

A failing malloc is very different from being killed at random.

With a failing malloc, the situation is well under control. Important
processes can be programmed to retry malloc several times with exponential
back-off. Even if such a process has to die, it can clean up its environment
first.
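
A minimal sketch of such a retry loop, assuming a system that reserves swap
at allocation time so that malloc really returns NULL on shortage. The name
malloc_with_backoff and its parameters max_tries and first_delay_s are
hypothetical, chosen only for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: try malloc() up to max_tries times, sleeping
 * between failed attempts and doubling the delay each time.
 * Returns NULL if every attempt fails, so the caller can still clean
 * up and shut down gracefully. */
void *malloc_with_backoff(size_t size, int max_tries, unsigned first_delay_s)
{
    assert(max_tries > 0);

    unsigned delay = first_delay_s;
    for (int attempt = 0; attempt < max_tries; attempt++) {
        void *p = malloc(size);
        if (p != NULL)
            return p;               /* swap space was reserved */
        if (attempt + 1 < max_tries) {
            sleep(delay);           /* wait for others to release memory */
            delay *= 2;             /* exponential back-off */
        }
    }
    return NULL;                    /* caller decides how to shut down */
}
```

A caller that gets NULL back can still flush its files and remove its
locks before exiting, which a process killed at page-fault time cannot do.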

>This will not happen without pre-allocation, and gives you the
>opportunity to get more work done and possibly never have trouble.

First, it is very probable (except with mmap) that a malloced area is
actually used. So, if there is not enough swap space, processes will
almost certainly be killed.

Without pre-allocation, a process will be killed without notice.

There is no opportunity for a retry or for a graceful shutdown.

That can cause great trouble.

>Next, there are actually ways to handle the deadlock.

Yes, it is always possible to resolve deadlock by human intervention.

>Note that Mach based
>implementations are willing to page on just about any vfs available.
>This means that Mach will page to Unix filesystems or NFS filesystems if
>desired.  The kernel (or some appropriately "wired" daemon) could note
>that the system was running out of paging space, and make arrangements to 
>either suspend memory consuming processes or mount more filesystems.

Of course it is not impossible; it is just next to impossible.

By the way, suspending memory-consuming processes is the worst thing to do.
The consumed memory will not be released until those processes are
reactivated, and the processes will not be reactivated until a large
amount of memory is released. That is a partial deadlock, and meanwhile
other processes will easily consume the rest of the memory, causing a
system-wide deadlock.

>Even
>if the deadlock actually occurred, you could suspend all the processes waiting
>for swap space, and then mount some reserve filesystem.

It is very strange that you have a reserve filesystem available for
swapping. You should have allocated such free space as swap area
in advance.

>Some care would then
>be necessary to let the important jobs finish.  It may be necessary to continue
>jobs selectively instead of all at once, to prevent a repeat of the deadlock.
>Even if you need some more memory to get the mounting done, you could kill 
>some non-critical system daemon that could be started later (like lpd or 
>something).  Later on, you could decide to restart the daemon killed earlier, 
>and/or unmount no longer used filesystems.  Eventually, you could return to
>normal operation.

Who takes care of all these things? Are you proposing to have a knowledgeable
person in attendance all the time? A person who understands what deadlock is
seems to be very uncommon, even in this newsgroup.

>This is a little complicated but certainly doable and allows you to not
>reserve swap space on memory allocation and to use a true COW fork().

But it is not worth doing. Use vfork.

>People may complain that this is not a truly
>general solution, and I would agree.

Vfork is the true solution.

>However, combined with
>the added flexibility of no preallocation, it seems justifiable.

No. Vfork does no preallocation either.
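
For reference, a minimal sketch of the usual vfork-then-exec pattern. The
helper name run_with_vfork is hypothetical, and the _DEFAULT_SOURCE macro
is an assumption about the build environment (glibc); other systems may
declare vfork differently or mark it obsolete:

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes vfork() */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a program with vfork() + execv().
 * The child borrows the parent's address space until it execs, so no
 * copy of the parent (and no swap reservation for one) is needed.
 * Returns the child's exit status, or -1 on error. */
int run_with_vfork(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;                  /* vfork itself failed */
    if (pid == 0) {
        /* Child: only exec or _exit is safe here. */
        execv(path, argv);
        _exit(127);                 /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child must do nothing between vfork and exec except call exec or
_exit; that restriction is the price of not copying the parent.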

>As a side note, on the large systems I work on, we don't do preallocation
>and have never run out of paging (swap) space.  This is not to say that
>we never will, but typical systems have on the order of at least 10 times as
>much disk storage as main memory.  In some cases, as much as 100 times more.

You have 100 times more swap space because you think it may be filled,
don't you?

>I claim it is hard to fill that much disk
>space with paging and swapping traffic and still have a usable system.  You'll
>probably be thrashing to death long before that.

As you may know, programs manipulating large arrays, if written properly,
can use a very large virtual space with little real memory without
thrashing. That is why some of your systems are configured with 100 times
more swap space, isn't it?
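
As one illustration of "written properly", here is a sketch that walks a
row-major array in storage order, so that each page is touched in one
sequential run and the working set stays small however large the array is.
The function name sum_row_major is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols matrix stored in row-major order.
 * The inner loop runs over consecutive addresses, so only the pages
 * around the current position need to be resident at any moment. */
double sum_row_major(const double *a, size_t rows, size_t cols)
{
    assert(a != NULL);

    double total = 0.0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += a[i * cols + j];
    return total;
}
```

Swapping the two loops would stride across the whole array on every step
of the inner loop, which is the access pattern that does thrash.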

						Masataka Ohta