Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!aplcen!samsung!brutus.cs.uiuc.edu!apple!motcsd!dms!albaugh
From: albaugh@dms.UUCP (Mike Albaugh)
Newsgroups: comp.arch
Subject: Why fork? (long, was Re: IBM PC prehistory)
Message-ID: <952@dms.UUCP>
Date: 16 Jan 90 17:56:12 GMT
References: <610@ssp11.idca.tds.philips.nl>
Organization: Atari Games Inc., Milpitas, CA
Lines: 83

From article <610@ssp11.idca.tds.philips.nl>, by willy@idca.tds.PHILIPS.nl (Willy Konijnenberg):
[quoting a lot of people arguing about MMUs and the needs of *nix, then
getting to the heart of the matter:]

> I don't think you should try to think of relocating a program once
> it has been running for a while. You have no way of knowing what it is
> doing with pointers.

	Precisely why systems like the Mac (and Minix on the PC) use
only segment-relative addressing. Well, actually, it gets a little more
complicated on the Mac, but believe me you don't want to know :-)

> When you run a unix-like system, there is one additional point where
> this scheme slows the system down, in addition to the relocation work
> during program load.
> As Craig noted, when the program fork()s, you have two programs that need
> to be located at the same virtual (== physical with no MMU) address space
> to run, so for every context switch, you must check whether the program is
> at the proper place and if not, swap things around (in memory, not necessarily
> to disk), which dramatically increases context switch overhead.

	Although not mentioned here, someone else on this thread (mis)stated
that the MMU was also needed for protection, which is not strictly true.
A scheme like that implemented on the early mid-range 360s can provide
protection without relocation, and can do it in _parallel_ with the fetch,
so there is no performance penalty. There will always be a performance
penalty for relocation, but it may be masked by other, still slower, parts
of the memory system. I just wanted to get that point out of the way early.
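
A minimal sketch in C of what "protection without relocation" can look
like, loosely modeled on 360-style storage keys (a 4-bit key per 2 KB
block, compared against a key held by the running program). The toy
memory size and the names set_storage_key and store_allowed are made up
for illustration; this is not a description of any particular machine's
microcode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 2048u   /* protection granularity (assumed: 2 KB blocks) */
#define NUM_BLOCKS 32u     /* toy memory of 64 KB, chosen arbitrarily */

/* One 4-bit key per block of physical storage. */
static uint8_t storage_key[NUM_BLOCKS];

/* Hypothetical supervisor operation: assign a key to a block. */
void set_storage_key(size_t block, uint8_t key)
{
    assert(block < NUM_BLOCKS);
    storage_key[block] = key & 0x0F;
}

/* A store is allowed if the program's key is 0 (supervisor) or matches
 * the key of the block being written. The address itself is never
 * modified, so there is no relocation step on the access path. */
bool store_allowed(uint8_t program_key, size_t addr)
{
    assert(addr < (size_t)BLOCK_SIZE * NUM_BLOCKS);
    uint8_t key = storage_key[addr / BLOCK_SIZE];
    return program_key == 0 || program_key == key;
}
```

Since the key lookup depends only on the high-order address bits, it
can proceed alongside the memory access instead of in front of it,
which is where the "no performance penalty" comes from.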

> Fortunately, this is normally not much of a problem, since usually a program
> does an exec() shortly after the fork() and this exec() can fix the problem.
> 
> This scheme is not very elegant, but it allows one to run a unix system
> on hardware like ST, Mac and Amiga.

	SO--- Why do we _still_ use fork() for all these near-trivial
cases? I have been mucking around with computers for over 20 years but am
not really familiar with *nix. I would like a reality check on this.
I'm also not asking on comp.unix... because I'm afraid that would be
like asking pointed questions about the trinity in a seminary :-) Since
comp.arch folk have to deal with _implementing_ this stuff, I thought I'd
get a more reasoned response ( 1/2 :-). Anyway, I can see a few reasons
to use fork:

1) It can be used for part of spawn _and_ for actual task-splitting
	(problem subdivision). Why have two calls when one will do?
2) On machines that are only (or mainly) swapping anyway, there is no
	penalty, so what the heck.
3) By just (effectively) copying the entire memory space, we don't
	need to keep track of just which parts actually _need_ to be
	passed to the new task (laziness as a virtue :-).
4) "We have always done it this way".

(my personal feelings are that 1 & 2 were the original reasons while
3 & 4 are the reasons we are stuck with it now)

Against this we have the problems mentioned above with handling *nix
programs on machines without dynamic relocation. Also, even machines
that _can_ do relocation don't get fork for free:

1) Machines with base/bounds registers may need to copy the whole
	memory image to a new area. If they have two sets (e.g. KA10)
	they might get away with "only" copying the data segment.
2) Paging machines still need to at least mark all data pages
	"copy on write", which may involve traversing the segment
	and page tables in software. For a large image this can
	be time consuming. Also, I'd imagine it's a real judgement
	call whether to deal with a page at a time or just punt and
	do the copy as soon as "enough" of the image has changed to
	make write-trap handling a nuisance.
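
The software walk in item 2 can be sketched as below. This is a toy
model, not any real kernel's page-table format: struct pte and
mark_copy_on_write are hypothetical names, and a real system would also
have to deal with segment tables, reference counts, and TLB flushes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page-table entry (illustrative only). */
struct pte {
    bool present;   /* page is mapped */
    bool writable;  /* stores are currently permitted */
    bool cow;       /* a store must trap and copy the page first */
};

/* Walk the whole table at fork time, write-protecting every mapped,
 * writable page and flagging it copy-on-write. Returns the number of
 * entries changed. */
size_t mark_copy_on_write(struct pte *table, size_t npages)
{
    assert(table != NULL);
    size_t changed = 0;
    for (size_t i = 0; i < npages; i++) {
        if (table[i].present && table[i].writable) {
            table[i].writable = false;
            table[i].cow = true;
            changed++;
        }
    }
    return changed;
}
```

The loop visits every entry whether or not the child ever touches the
page, so the cost at fork time grows with the size of the image.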

	And all this hassle so that three or four instructions later the
program can overlay itself (most of the time). I must be missing something
major here. Can someone tell me what?
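
For concreteness, the idiom in question looks roughly like this in C.
It is a minimal sketch using the standard fork(), execvp(), and
waitpid() calls; the wrapper name spawn_and_wait is made up for
illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0] and return its
 * exit status, or -1 if it could not be started or died abnormally. */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();          /* duplicate the whole address space... */
    if (pid < 0)
        return -1;               /* fork itself failed */

    if (pid == 0) {
        execvp(argv[0], argv);   /* ...only to throw the copy away here */
        _exit(127);              /* reached only if the exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Everything the child does between fork() and execvp() is a single
comparison on the return value, which is the "three or four
instructions" case above.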

				Mike

> 	Willy Konijnenberg		<willy@idca.tds.philips.nl>

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!pyramid!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035		voice: (408)434-1709
| The opinions expressed are my own (Boy, are they ever)