Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!mcsun!unido!mikros!mwtech!martin
From: martin@mwtech.UUCP (Martin Weitzel)
Newsgroups: comp.unix.wizards
Subject: Who is responsible for a retry (was Re: Is System V.4 fork reliable?)
Message-ID: <866@mwtech.UUCP>
Date: 29 Jul 90 14:21:52 GMT
References: <561@oglvee.UUCP> <480@amanue.UUCP> <13426@cbmvax.commodore.com> <573@oglvee.UUCP> <7885@tekgvs.LABS.TEK.COM>
Reply-To: martin@mwtech.UUCP (Martin Weitzel)
Organization: MIKROS Systemware, Darmstadt/W-Germany
Lines: 47

In article <7885@tekgvs.LABS.TEK.COM> terryl@sail.LABS.TEK.COM writes:
[many wise words about KISS principle and the spirit of unix deleted]

But IMHO it's not quite appropriate here. I think the question here
is: Who should retry if a fork fails?

To see the problem I think that we should generalize a little. Just
consider the case of disk reads for a moment. Surely there is no one
among us who doesn't appreciate that the device drivers issue
retries(%) if a read fails, so that an error from a read in an
application can be considered a permanent error.

(%: Maybe, if I were about to write a program which tests for flaky
disk blocks, I would not be so happy with kernel retries ...)

Of course, an application can choose to retry after bad reads, and
I've had cases of "ill" disks where running a program in the
background for some hours helped me to recover 100% of the "bad
blocks" by patiently retrying ... just 1 out of 100 reads or so
happened to be successful. On the other hand, I would never wrap the
disk reads in "normal" programs in a retry loop - why bother: the
kernel drivers generally solve the problem well.

Now, why is the situation so different with "fork"?

As I understand all the traffic here, the "real" problem is in fact
that the EAGAIN error may stand for two very different problems: One
is more of a "long-term" problem (no free slots in the process table,
or the per-user limit reached, where the slots could also be taken by
zombies caused by careless programming techniques); the other is a
very short-term problem, which is difficult to correct in the kernel
because of the complexity of the algorithms in that area.

So I think the complaints here *are* right from the view of an
application developer, but instead of wrapping every fork in every
application program in a retry loop, I think there's a more pragmatic
(though not ideal) approach: Why not enhance the interface to fork in
the standard library with a retry capability?

For many of us, "library + kernel" are more or less a monolithic block
(we can't easily change either of them 1/2:-)), so if an error
returned from fork could always be treated as the "long-term" error
condition described above, everything would be fine.

Well, only a suggestion, maybe someone will post such a piece of code
soon ...
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
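
A minimal sketch of the library-level retry suggested above, in C. The
names fork_retry() and run_trivial_child(), the retry count and the
fixed delay are all assumptions made for illustration; no standard
library provides such a wrapper. It retries only on EAGAIN, so a
caller that still sees -1 can treat it as the "long-term" condition.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * fork_retry() is a hypothetical name, not an existing library call.
 * It calls fork() up to max_tries times, sleeping delay_secs seconds
 * between attempts, and retries only when fork() fails with EAGAIN.
 */
pid_t fork_retry(int max_tries, unsigned int delay_secs)
{
    int attempt;

    for (attempt = 1; ; attempt++) {
        pid_t pid = fork();

        /*
         * Return on success (in parent or child), on any failure
         * other than EAGAIN, or when the attempts are used up.
         * On failure errno is still the one set by fork().
         */
        if (pid >= 0 || errno != EAGAIN || attempt >= max_tries)
            return pid;

        sleep(delay_secs);
    }
}

/*
 * Example caller (also hypothetical): start a child that exits at
 * once and return its exit status, or -1 if anything goes wrong.
 */
int run_trivial_child(void)
{
    int status;
    pid_t pid = fork_retry(5, 1);   /* 5 tries, 1 second apart: arbitrary */

    if (pid < 0)
        return -1;                  /* treat as the "long-term" condition */
    if (pid == 0)
        _exit(0);                   /* child: nothing to do */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A fixed delay keeps the sketch short; whether a few retries spread
over a few seconds are enough to ride out the short-term case depends
on the kernel in question and is not something this sketch can settle.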