Path: utzoo!attcan!utgpu!watserv1!maytag!watdragon!watsol.waterloo.edu!tbray
From: tbray@watsol.waterloo.edu (Tim Bray)
Newsgroups: comp.unix.wizards
Subject: Sys V fork IS broken!
Message-ID: <1990Jul28.195032.18746@watdragon.waterloo.edu>
Date: 28 Jul 90 19:50:32 GMT
References: <480@amanue.UUCP> <13426@cbmvax.commodore.com> <573@oglvee.UUCP> <13435@smoke.BRL.MIL>
Sender: daemon@watdragon.waterloo.edu (Owner of Many System Processes)
Organization: University of Waterloo
Lines: 40

gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
  jr@oglvee.UUCP (Jim Rosenberg) writes:
  -But if system calls fail simply because of a very temporary bout of activity,
  -that is *not my problem*!  It's the kernel's problem...

  Oh, good grief.  It is SILLY to say that the kernel should be redesigned
  to compensate for bugs in application programs.

I've been earning my living writing application programs on Unix for some
years.  Sometimes application programs need to fork().  (in fact, an
informal scan of my memory fails to reveal an important non-trivial
application that never does a fork() (and the semantics of fork() are just
right and one of the best things about unix (and those who talk about the
need for a spawn() or a run() call should spend a few years in the
SYS$_CREPRC mines (sorry for the digression)))).

Every application I've written, and every other one I've seen (aside from
amateurish toys that don't check return codes) forks about like this:

    if ((child = fork()) == -1)
        FatalSystemError("Serious system trouble!  Can't create process!");
    else if (child == 0) {
        /* child */
    } else {
        /* parent */
    }

I think this is right and Doug Gwyn's comment is (unusually for him) wrong.
Having write(2) fail because a disk is full is OK - there are several
strategies which a program might reasonably adopt to handle this.  But
having fork() fail because of a likely-transient OS state is a stinking
crock.  If there is a good chance that the kernel can fix this up without
a gratuitous time delay, it should do so.
If not (i.e. process creation has become impossible) the whole system is
seriously sick and all the applications should ideally hear about this PDQ
so they can start taking disaster relief measures.  I don't really think
there's a middle ground here.

And speaking from my experience in the application community, I think
describing absence of special-purpose backoff & retry code for handling
process creation failure by the OS as "bugs in application programs" is
pretty arrogant and unrealistic.

Cheers,
Tim Bray, Open Text Systems, Waterloo, Ont.
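
For concreteness, here is roughly what the "special-purpose backoff & retry code" at issue might look like on a POSIX system. This is only a sketch under stated assumptions: the names fork_with_retry() and spawn_and_wait(), the retry count, and the delays are all hypothetical and appear nowhere in the thread. It assumes a transient shortage is reported as fork() returning -1 with errno set to EAGAIN, which POSIX documents for resource or process-limit exhaustion; any other error is returned at once.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): call fork(), and if it
 * fails with EAGAIN, sleep and try again, doubling the delay each time.
 * Returns whatever fork() returned on the final attempt. */
pid_t fork_with_retry(int max_tries, unsigned initial_delay_ms)
{
    unsigned delay_ms = initial_delay_ms;
    pid_t child = -1;

    assert(max_tries > 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        child = fork();
        if (child != -1 || errno != EAGAIN)
            return child;   /* success (in parent or child) or a non-transient error */

        /* Transient failure: back off, unless this was the last attempt. */
        if (attempt + 1 < max_tries) {
            struct timespec ts;
            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
            delay_ms *= 2;
        }
    }
    return child;           /* -1, with errno still EAGAIN from the last fork() */
}

/* Usage sketch: create a child that exits immediately, wait for it, and
 * return its exit status, or -1 if it could not be created or reaped. */
int spawn_and_wait(void)
{
    pid_t child = fork_with_retry(5, 100);   /* 5 tries, 100 ms first delay: arbitrary */
    int status;

    if (child == -1)
        return -1;          /* still failing after retries: treat as serious trouble */
    if (child == 0)
        _exit(0);           /* child */

    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Even in this small form, the retry count and delay are guesses that each application would have to make for itself, which is the burden being objected to above.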