Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!mcnc!rti!xyzzy!throopw
From: throopw@xyzzy.UUCP (Wayne A. Throop)
Newsgroups: comp.unix.questions
Subject: Re: Fork and Join, Pipe in C
Message-ID: <113@xyzzy.UUCP>
Date: Sat, 27-Jun-87 11:51:10 EDT
Article-I.D.: xyzzy.113
Posted: Sat Jun 27 11:51:10 1987
Date-Received: Sun, 28-Jun-87 04:44:14 EDT
References: <7737@brl-adm.ARPA>, <1186@ius2.cs.cmu.edu> <8174@utzoo.UUCP> <21685@sun.uucp>
Organization: Data General, RTP NC.
Lines: 60

> guy%gorodish@Sun.COM (Guy Harris)
> The only remaining reason to use vfork is that one has written ugly
> and brain-damaged code that depends on the semantics of "vfork".

But fork() can *never* be as efficient as vfork(), and sometimes this
efficiency is crucial.  Even if you can create copy-on-demand during
fork(), the database needed to keep track of these pages consumes kernel
(possibly virtual) memory space, and the creation and upkeep of this
data consumes kernel CPU time.  The vfork() is hinting to the kernel
that an exec() is about to occur, and therefore the kernel is
well-advised not to invest much in the correct copy of the address
space.

That said, I agree that vfork is a horrible kludge.  What *should* have
been provided is the ability to create a process running a new executable
image in a single system call.  This is the only way the kernel can be
*assured* (rather than simply having it hinted at) that this process
creation will not involve the copying of any memory from the parent
process.  The fork() operation is still interesting and necessary, and
the exec() operation is still interesting and necessary, but so is the
create_process() operation.  If you think the operation is not
interesting and common, note how many zillions of library routines use
fork() or vfork() immediately followed by exec(), and provoke the kernel
into doing all sorts of unnecessary work, even if that kernel implements
copy-on-demand.
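
For concreteness, the idiom in question looks roughly like the sketch
below.  The helper name run_with_fork_exec() is made up for illustration
and does not come from any particular library; error handling is kept
minimal.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (the name is an assumption, not a library routine):
 * run the program argv[0] with arguments argv in a child process and
 * return its exit status, or -1 on failure or abnormal termination. */
int run_with_fork_exec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();          /* kernel prepares a copy of our address space */
    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0) {
        execvp(argv[0], argv);   /* ...and the copy is discarded right here */
        _exit(127);              /* reached only if the exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the fork() and the execvp() the child touches almost nothing,
yet the kernel cannot know that in advance.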

The counterargument is that each system call should do only one thing,
and they should be combined to make more complicated operations.  In
this sense, I agree that only fork() and exec() are needed.  But
engineers don't build out of nand-gates only, though that is all that is
logically necessary.  They use gates that more clearly convey intent,
and thus can gain significant efficiency.  Similarly, create_process()
guarantees the kernel important information about the programmer's
intent that combinations of fork() and exec() do not, and in this case I
think the practical benefits of create_process() outweigh the
theoretical benefits of a "pure" "simple" set of system calls.
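
As a rough sketch of what such a call might look like to the programmer,
here is a hypothetical create_process() written on top of posix_spawnp(),
a later POSIX interface that this article does not itself refer to.  The
signature is an assumption made for illustration, and whether the kernel
really sees a single operation depends on how a given system implements
posix_spawnp().

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical create_process(): the name is taken from the discussion,
 * but this signature and body are assumptions.  Starts argv[0] (looked
 * up via PATH) as a new process with the caller's environment and
 * returns the child's pid, or -1 if the process could not be created. */
pid_t create_process(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid;
    /* No file actions and no spawn attributes: plain "start this image". */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    return err == 0 ? pid : -1;
}
```

The caller states its intent in one call and never holds a copy of its
own address space in the child; the parent still collects the child with
waitpid() as usual.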

To finish off, let's look at the capabilities of fork() and exec() laid
out in a table:

                                load image      create process
        --------------------+---------------------------------
        do nothing          |   no              no
        fork()              |   no              yes
        exec()              |   yes             no
        create_process()    |   yes             yes

I think it is clear that there "ought" to be a create_process(), even
though it can be built out of fork() followed by exec(), because
whatever tricks are used, fork() needs to get a process memory image
from somewhere, and this will always come at some cost.

(Although, strangely enough, I will not argue that there ought to be a
 do_nothing() system call...)

--
An operating system is a set of manual and automatic procedures that
enable a group of people to share a computer installation efficiently.
                                --- Per Brinch Hansen
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw