Path: utzoo!attcan!lsuc!eci386!jmm
From: jmm@eci386.uucp (John Macdonald)
Newsgroups: comp.unix.wizards
Subject: Re: Sys V fork IS broken!
Message-ID: <1990Jul31.070049.21446@eci386.uucp>
Date: 31 Jul 90 07:00:49 GMT
References: <573@oglvee.UUCP> <13435@smoke.BRL.MIL> <1990Jul28.195032.18746@watdragon.waterloo.edu> <13447@smoke.BRL.MIL>
Reply-To: jmm@eci386.UUCP (John Macdonald)
Organization: Elegant Communications Inc.
Lines: 56

In article <13447@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
|In article <1990Jul28.195032.18746@watdragon.waterloo.edu> tbray@watsol.waterloo.edu (Tim Bray) writes:
|>  if ((child = fork()) == -1)
|>    FatalSystemError("Serious system trouble! Can't create process!");
|>... I think describing absence of special-purpose backoff & retry code for
|>handling process creation failure by the OS as "bugs in application programs"
|>is pretty arrogant and unrealistic.
|
|The bug is that your application makes no attempt to recover from a known
|class of error, EAGAIN in this case. [...]

Well, yes, but ...

This is a "known class of error" that has been added to the meaning
of fork over the years. (I don't know when in the various branches
of the family tree, but probably around the same time that support
for memory paging was being added.)

I tend to concur with an earlier poster who suggested that returning
EAGAIN when the problem is only a temporary lack of resources is
analogous to returning EAGAIN because the disk buffer cache is
temporarily full, instead of putting the process to sleep while a
buffer cache entry is emptied on its behalf.

Obviously, from the intensity with which people insist that this
really is an error, the situation is not that simple. Could someone
please explain why? Is it too difficult (or impossible) to distinguish
between a transient and a deadlocked lack of resources?
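
To make concrete what is being asked of application code, here is a
minimal sketch of backoff and retry around fork. Nothing in it comes
from the thread: the names fork_retry and fork_retry_with, the limit
of five attempts, and the doubling delay starting at one second are
all assumptions chosen for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread.  fork_fn is a parameter
 * only so that the retry policy can be separated from the real fork(). */
typedef pid_t (*fork_fn_t)(void);

/* Call fork_fn up to max_tries times.  Retry only when it fails with
 * EAGAIN, sleeping between attempts and doubling the delay each time.
 * Returns whatever the last call returned; on failure errno is the
 * value set by that last call. */
pid_t fork_retry_with(fork_fn_t fork_fn, int max_tries, unsigned first_delay_sec)
{
    unsigned delay = first_delay_sec;
    pid_t pid = -1;
    int attempt;

    if (max_tries < 1)
        max_tries = 1;              /* always try at least once */

    for (attempt = 0; attempt < max_tries; attempt++) {
        pid = fork_fn();
        if (pid != -1 || errno != EAGAIN)
            return pid;             /* success, or an error retrying won't fix */
        if (attempt + 1 < max_tries) {
            sleep(delay);           /* give the system time to free resources */
            delay *= 2;
        }
    }
    return pid;                     /* still -1, errno == EAGAIN */
}

/* Convenience wrapper with assumed defaults: 5 attempts, 1 s first delay. */
pid_t fork_retry(void)
{
    return fork_retry_with(fork, 5, 1);
}
```

In the fragment quoted above, the bare fork() would become
fork_retry(), and a return of -1 would be treated as fatal only once
the retries are exhausted. Note that this is exactly the policy
question at issue: the loop cannot tell a transient shortage from a
permanent one any better than the kernel can.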

Would the people claiming that "this is a policy decision that should
not be in the kernel" also claim that having the kernel automatically
wait for a buffer cache entry is a mistake under the same design
philosophy? (If not, and if the reason is not just practical
considerations such as detectability and certainty of success, what is
the difference?)

While Doug claims that Tim's code above ignores a known class of
error, this was not always a known class of error - in earlier
versions of Unix it was not a class of error at all. Certainly, as
someone who has been writing code using fork since version 5 days,
I can admit that I have never before noticed the change from "error
from fork is usually not recoverable" to "error from fork is possibly
recoverable if you try again in a while" between S3 and S5.

Perhaps a document should be written for each new system release
describing the changes in programming practice that it calls for - it
could cover any change that has required a significant proportion of
the standard program set to be examined and fixed for the new (or
newly noticed) desired programming method. Programmers should not be
required to change their normal programming practices without
justification (which I think can be provided in this case) or without
clear explanation (which is often lacking).
-- 
Algol 60 was an improvement on most | John Macdonald
of its successors - C.A.R. Hoare    | jmm@eci386