Path: utzoo!attcan!uunet!decwrl!bacchus.pa.dec.com!decvax.dec.com!mcnc!rti!dg-rtp!magic!rice
From: rice@dg-rtp.dg.com (Brian Rice)
Newsgroups: comp.unix.wizards
Subject: Re: Sys V fork IS broken!
Keywords: "What UUUUUUUUNIX meeeeans to meeeeeee..."
Message-ID: <1990Jul30.002642.18244@dg-rtp.dg.com>
Date: 30 Jul 90 00:26:42 GMT
References: <480@amanue.UUCP> <13426@cbmvax.commodore.com> <573@oglvee.UUCP> <13435@smoke.BRL.MIL> <1990Jul28.195032.18746@watdragon.waterloo.edu>
Sender: usenet@dg-rtp.dg.com (Usenet Administration)
Reply-To: rice@dg-rtp.dg.com
Followup-To: comp.unix.wizards
Organization: Data General Corporation, Research Triangle Park, NC
Lines: 112

In article <1990Jul28.195032.18746@watdragon.waterloo.edu>,
tbray@watsol.waterloo.edu (Tim Bray) writes:
|> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
|> > jr@oglvee.UUCP (Jim Rosenberg) writes:
|> > -But if system calls fail simply because of a very temporary bout of activity,
|> > -that is *not my problem*!  It's the kernel's problem...
|> > Oh, good grief.  It is SILLY to say that the kernel should be redesigned
|> > to compensate for bugs in application programs.
|>
|> I think [...] Doug Gwyn's comment is (unusually for him) wrong.
|>
|> Having write(2) fail because a disk is full is OK - there are several
|> strategies which a program might reasonably adopt to handle this.  But having
|> fork() fail because of a likely-transient OS state is a stinking crock.

My fingers almost made me redirect followups to this post to
alt.religion.computers, because we are surely veering close to
matters of faith.  But I do think there's something to be said
in defense of traditional fork.

|> If there is a good chance that the kernel can fix this up without a gratuitous
|> time delay, it should do so.  If not (i.e. process creation has become
|> impossible) the whole system is seriously sick and all the applications should
|> ideally hear about this PDQ so they can start taking disaster relief
|> measures.

If the kernel has to make a call to fork fail, it does so for one of
exactly two reasons: some system-imposed limit would be exceeded, or
insufficient memory is available.  That's all.  Neither of these
conditions means that the system is "seriously sick"; any process
which isn't going to fork again and (in the second case) isn't going
to do anything malloc'y need never even hear of the situation.

If the system really is "sick"--i.e., some internal data structure is
corrupted--then the system is going to panic, *now*, and rightly so.
(If the kernel can't believe its own internal data, how can it
credibly notify processes to begin "disaster relief"?  Admittedly,
there's a bit of computer religion here: that programs should fail
before they lie.  But I think that sect has a great many adherents.)
Conversely, a system isn't sick just because resources are under
heavy contention.

And, of course, the kernel tells you why your fork failed: you get
EAGAIN or ENOMEM in errno.  All told, this means that you, the
application programmer, get to choose what happens in the event of a
fork failure, and you even get some information to help your
application make the choice.  That "Put the programmer in the
driver's seat" orientation really is what UNIX means to me.

|> And speaking
|> from my experience in the application community, I think describing absence of
|> special-purpose backoff & retry code for handling process creation failure by
|> the OS as "bugs in application programs" is pretty arrogant and unrealistic.

"Special-purpose backoff and retry code"?  Can the kernel really do
better than this?

    while ((child = fork()) == -1 && ++error_count < MAX_FORK_FAILURES) {
        switch (errno) {
        case ENOMEM:
            if (theres_some_junk_I_can_free()) {
                free(junk);
                break;
            }
            /* fall through */
        case EAGAIN:
            sleep(MAYBE_LIFE_WILL_BE_NICER_IN_THIS_MANY_SECONDS);
            break;
        default:
            FatalError("Argh!  The man page lied! #@!$& phone company OS!");
            exit(1);
        }
    }
    if (child == -1) {
        FatalError("Waaah!  The kernel won't let me fork!");
        exit(1);
    }

Well, maybe the kernel could queue each fork request that it was
unable to complete and then satisfy each request in order...or maybe
it could satisfy the smallest request first, with some kind of aging
mechanism to keep from starving forks of big processes, etc.,
etc....this would get complicated, clearly, and might even require so
much overhead as to provoke thrashing.  But maybe you could do it.

If you could, then how would you deal with the person who said,
"Wait--if the system is low on memory, I don't want my fork retried;
I want to hear about it so I can go off and do something else (maybe
just sleep), then retry"?  This is the person who liked the old fork,
and there are lots of such folk.  Looks like you'll have to add an
old-fork-behavior flag, and then you'll have two kinds of forks, some
on a queue and some not, and all wanting resources...

Clearly, this way lies VMS$MADNESS.  Let's hear it for minimal
function calls with clean interfaces, even if they necessitate a few
more lines of application code.  After all, *you* get to write that
code, and you can package it up into a library function if you don't
want to type it more than once.

Brian Rice   rice@dg-rtp.dg.com   +1 919 248-6328
DG/UX Product Assurance Engineering
Data General Corp., Research Triangle Park, N.C.
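
For what it's worth, the "package it up into a library function" idea from the post might look roughly like the sketch below. The name fork_with_retry and its two parameters are hypothetical, invented here for illustration; they do not come from the post or from any standard library. The sketch retries only on EAGAIN and ENOMEM, the two errors the post names, and simply sleeps between attempts rather than trying to free memory.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a standard call): try fork() up to
 * max_attempts times, sleeping delay_seconds between failed attempts.
 * Returns what fork() returns on success (0 in the child, the child's
 * pid in the parent), or -1 with errno set if every attempt failed.
 */
pid_t fork_with_retry(int max_attempts, unsigned int delay_seconds)
{
    int attempt;

    if (max_attempts < 1) {
        errno = EINVAL;             /* caller asked for zero attempts */
        return -1;
    }

    for (attempt = 0; attempt < max_attempts; attempt++) {
        pid_t child = fork();
        int saved_errno;

        if (child != -1)
            return child;           /* success, in parent or child */

        /* Only the two "try again later" errors are worth retrying. */
        if (errno != EAGAIN && errno != ENOMEM)
            return -1;

        if (attempt + 1 < max_attempts) {
            saved_errno = errno;    /* keep the fork error for the caller */
            sleep(delay_seconds);
            errno = saved_errno;
        }
    }
    return -1;                      /* still failing; errno says why */
}
```

A caller that wants the old behavior still just calls fork() directly; one that wants patience calls something like fork_with_retry(5, 2) and treats a -1 return as final. Either way the policy stays in application code, which is the point being argued.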