Path: utzoo!attcan!uunet!cs.utexas.edu!mailrus!bbn.com!lkaplan
From: lkaplan@bbn.com (Larry Kaplan)
Newsgroups: comp.arch
Subject: fork() vs vfork() and COW
Keywords: copy-on-write (COW)
Message-ID: <58053@bbn.BBN.COM>
Date: 9 Jul 90 14:09:41 GMT
References: <1122@sirius.ucs.adelaide.edu.au> <3830012@hpcupt1.HP.COM>
Sender: news@bbn.com
Reply-To: lkaplan@BBN.COM (Larry Kaplan)
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 69

I wrote:
>> The idea behind 
>> copy-on-write is that a proper fork() requires NO EXTRA physical memory for 
>> the child process (except that required for kernel data structures and page 
>> tables).  You probably end up needing one stack page pretty soon. 
>> This implementation of fork() is
>> what fork() was always intended to be, as far as I can tell.  By doing fork()
>> correctly, the need for a separate vfork() disappears (as stated in the BSD
>> man pages).

In article <3830012@hpcupt1.HP.COM> renglish@hpcupt1.HP.COM (Robert English) 
writes:

>Ain't necessarily so, for a couple of reasons.
>
>First, while vfork() was a hack intended to get around the absence of a
>copy-on-write scheme, it has different semantics from a regular fork(),
>and those semantics can sometimes be useful.  /bin/csh, for example,
>takes advantage of them.  Such use may not be wise, but it does exist,
>and if you just get rid of vfork(), there will be programs that break.
>Not many, but some.

While this is true, it is very unfortunate.  This is an RTFM (read the
@#$%& manual) problem.  The vfork() man page says right there:
"Users should not depend on the memory sharing semantics of vfork() ..."
Code that does depend on them is "incorrect".  We did have to spend some
small amount of time addressing this problem in various system programs.
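
To make the manual's warning concrete, here is a minimal sketch of a vfork()
use that does not depend on memory sharing.  The function name run_shell() and
the choice of /bin/sh are my own, purely for illustration.  The child does
nothing but exec or _exit(), so the code should behave the same whether
vfork() shares the parent's address space or is simply made a synonym for
fork().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "/bin/sh -c cmd" in a vfork() child and return
 * the command's exit status, or -1 on failure. */
int run_shell(const char *cmd)
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;                      /* vfork failed */
    if (pid == 0) {
        /* Child: touch no parent data, just exec or bail out. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(127);                     /* exec failed */
    }
    /* Parent: resumes once the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Code that instead has the child store into variables for the parent to read
afterwards is the kind that breaks when vfork() stops sharing memory.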

>Second, and more important from an architectural point of view,
>vfork-exec is usually faster than fork-exec, even when copy-on-write
>is implemented.  Whenever you fork a process, you have to copy its data
>structures.  For a large process, the work of setting up an entire
>virtual memory structure for a process and then immediately tearing it
>down can be significant, even when the process's data is not actually
>copied.  

Copying system data structures is indeed a non-zero cost operation.  However,
typical Unix programs have a relatively simple memory structure.  With MACH
style maps, entries and objects, very little is needed to describe the
process's address space.  The data structure allocations would require about
six zone allocations (from preallocated linked lists) and filling in the
fields of these structures.  This can't take very long and seems worth the
effort.  As long as the process has this Unix style map (only 3 segments),
this step is virtually fixed in cost no matter what the address space size.
Adjusting the page tables of the parent to support copy-on-write (COW) can
take time proportional to the address space currently used in the parent but
is again a relatively cheap operation.  While I can't claim that fork-exec
with COW is faster than vfork-exec, I claim that it is not significantly
slower in most cases, and the COW implementation provides significant
advantages in other areas such as parallel programming.  Note that vfork()
has other problems in certain multiprocessing memory architectures.
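
As a rough illustration of the semantics that fork() has to provide, here is
a small sketch.  The function name child_write_is_private() is made up for
this example.  The child writes a single byte into a large buffer inherited
from the parent, and the parent checks that its own copy is unchanged.  With
COW, one would expect only the touched page to be copied, but note that the
program itself cannot tell whether the kernel used COW or a full copy; it
only shows the behavior that either implementation must give.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write made by the child after fork()
 * is invisible to the parent, 0 if the parent sees it, -1 on error. */
int child_write_is_private(size_t nbytes)
{
    int status, unchanged;
    pid_t pid;
    char *buf;

    if (nbytes == 0)
        return -1;
    buf = malloc(nbytes);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', nbytes);           /* parent's data */

    pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        buf[0] = 'C';                   /* child modifies its own view only */
        _exit(buf[0] == 'C' ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        free(buf);
        return -1;
    }
    unchanged = (buf[0] == 'P');        /* parent's byte must be intact */
    free(buf);
    return unchanged;
}
```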

>
>Finally, all of this discussion misses the point that the normal
>performance-critical sequence is a fork followed by some kernel data
>structure manipulations and an exec, and that both performance and
>purity could be achieved by providing a single system call that performs
>the whole sequence.

Seems true, though I don't know of anyone who has done this in UNIX.  One 
problem might be the file descriptor maintenance and other little things the 
parent likes to have done in the child before it execs.
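
For example, a sketch of the usual sequence with output redirection looks
something like the following.  The function name run_with_stdout_to() and its
arguments are invented for illustration.  The open/dup2/close steps between
fork() and exec are the sort of "little things" that a combined call would
somehow have to accept as parameters.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its standard output redirected to
 * the file at path.  Returns the child's exit status, or -1 on failure. */
int run_with_stdout_to(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: file descriptor maintenance before the exec. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);                  /* drop the extra descriptor */
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each program tends to want a slightly different set of these steps, which is
what makes a single fixed system call awkward to design.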

#include <std_disclaimer>
_______________________________________________________________________________
				 ____ \ / ____
Laurence S. Kaplan		|    \ 0 /    |		BBN Advanced Computers
lkaplan@bbn.com			 \____|||____/		10 Fawcett St.
(617) 873-2431			  /__/ | \__\		Cambridge, MA  02238