Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!pt.cs.cmu.edu!dsl.pitt.edu!pitt!amanue!oglvee!jr
From: jr@oglvee.UUCP (Jim Rosenberg)
Newsgroups: comp.unix.wizards
Subject: Is System V.4 fork reliable?
Message-ID: <561@oglvee.UUCP>
Date: 6 Jul 90 20:01:56 GMT
Organization: Oglevee Computer Systems, Connellsville, Pa
Lines: 43

Somewhere along in the development of System V, fork became an unreliable
system call.  At least it is on my (V.3) system.  I asked the net about
this, and after some completely wrong answers that we were out of swap
space, the story that emerged was (to cite an old posting, Jerry Gardner in
<3696@altos86.Altos.COM>):

> The fork() failures you are seeing are occurring when procdup() calls
> dupreg().  Dupreg() calls ptsalloc() which eventually calls getcpages() to
> allocate memory for page tables to map the new child process' u-area.
> Apparently, the kernel is paranoid in one place here and it calls ptsalloc
> with a flag that doesn't allow it to sleep.

Apparently if sleep were allowed a deadlock could occur.  The result is that
an intensive burst of activity can cause fork to fail, even though really
the system is *not* out of resources and ought to be able to handle it.
(Since the page-stealing daemon is asynchronous, you can never guarantee
*exactly* when it will run.)

Numerous people suggested more RAM as the cure.  Right.  What that amounts
to saying is, "Get enough RAM so that you *NEVER* page."  I.e. V.3 has
virtual memory, but don't assume you can really use it.

The number of utilities that both use fork and also understand that under
some circumstances it ought to be *retried* if it fails is pitifully small.
(The shell simply reports the bogus message "No more processes".  On my
system when cron incurs a fork failure it logs that it is "rescheduling" the
job.  Right.  cron "reschedules" into oblivion, ceasing to run *any* jobs.)

My question is:  Is this *FIXED* in V.4?  I went to the V.4 internals
tutorial at Usenix in D.C.
V.4 does have an asynchronous page-stealing daemon and does have a kernel
memory allocate call with a flag to either sleep or not sleep.  Do any of
the kmem_alloc() calls (if I remember the name right, I don't have my notes
handy) resulting from fork *not* allow sleep?  If so I believe that would
also give V.4 the lovely V.3 feature of unreliable fork.  And in this
I-hope-not case, the man page for fork(2) should at least tell the truth and
make clear the circumstances under which fork should be retried.  And all
the utilities which fork should be hacked to actually do those retries.
---
Jim Rosenberg             #include               --cgh!amanue!oglvee!jr
Oglevee Computer Systems                              /      /
151 Oglevee Lane, Connellsville, PA 15425       pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com         /      /
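
For what it's worth, the kind of retry meant above can be sketched in a few
lines of C.  This is only a sketch under stated assumptions, not code from
any actual utility: fork_with_retry(), spawn_and_wait(), max_tries and
delay_sec are made-up names, and it assumes the transient failure shows up
as errno EAGAIN (which is what the shell's "No more processes" message
suggests).  Whether a short sleep is long enough for the page-stealing
daemon to free memory is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: call fork() up to max_tries times, sleeping
 * delay_sec seconds between attempts, but only when the failure is
 * EAGAIN (assumed here to be the "transient" error).  Returns what
 * fork() returns: child pid in the parent, 0 in the child, -1 on failure.
 */
pid_t fork_with_retry(int max_tries, unsigned int delay_sec)
{
    int attempt;
    pid_t pid = -1;

    assert(max_tries >= 1);

    for (attempt = 0; attempt < max_tries; attempt++) {
        pid = fork();
        if (pid >= 0)
            return pid;          /* success: parent (>0) or child (0) */
        if (errno != EAGAIN)
            break;               /* some other error: retrying won't help */
        if (attempt + 1 < max_tries)
            sleep(delay_sec);    /* give the pager a chance to free memory */
    }
    return -1;                   /* errno is as set by the last fork() */
}

/* Example use: run a trivial child and return its exit status, or -1. */
int spawn_and_wait(void)
{
    int status;
    pid_t pid = fork_with_retry(5, 1);

    if (pid < 0)
        return -1;               /* still failing after all retries */
    if (pid == 0)
        _exit(0);                /* child: real work would go here */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller such as cron or the shell would use fork_with_retry() wherever it
now calls fork() directly, and report failure only after the retries are
exhausted.  The retry count and delay are arbitrary here; a real utility
would have to pick values that suit its own workload.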