Xref: utzoo comp.unix.i386:873 comp.unix.wizards:18793
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!pt.cs.cmu.edu!cadre.dsl.pitt.edu!pitt!amanue!oglvee!jr
From: jr@oglvee.UUCP (Jim Rosenberg)
Newsgroups: comp.unix.i386,comp.unix.wizards
Subject: Re: Help! Altos 5.3.1 fork is failing!
Message-ID: <508@oglvee.UUCP>
Date: 19 Oct 89 16:14:29 GMT
References: <506@oglvee.UUCP> <2296@hcr.UUCP>
Reply-To: jr@oglvee.UUCP (Jim Rosenberg)
Organization: Oglevee Computer Systems, Connellsville, Pa
Lines: 60

In article <2296@hcr.UUCP> larry@zeus.UUCP (Larry Philps) writes:
>Getcpages, is indeed get "contiguous" physical pages.  There are parts of the
>paging system on some processors that require this.  The complaint about a
>failure on 1 page simply means that ALL RAM was being used when the fork
>appeared and the system needed a page to hold page tables or the like.
>
>Now, for some reason unknown to me, in fork (procdup actually), dupreg is
>called with arguments that specify that it is not to sleep.  I couldn't come
>up with any sensible reason why this had to be, so I changed the call to
>allow sleeps.  The fork failure problems simply went away, and no other
>problems manifested.

OK, kernel gurus, what's the word: *is there* a good reason why the call to
dupreg shouldn't sleep???  We are also running V.3.2 on a bunch of AT&T
6386en.  Those machines have only 2M RAM.  I know damn well that we're just on
the borderline of what's doable with that little memory -- it's a budget
issue, not a technical issue.  Although I do often suffer from the overhead of
paging, I've *NEVER* seen a fork failure on these machines.  Admittedly this
is V.3.2 and not V.3.1.  But I wonder if AT&T did go ahead and change the
dupreg call to allow a sleep.  Can someone from AT&T comment?
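
Until the kernel question is settled, one user-level stopgap is to retry the
fork from the calling program.  What follows is only a minimal sketch.  It
assumes the failure shows up as fork() returning -1 with errno set to EAGAIN
or ENOMEM (not verified on the Altos), and that a short sleep gives the
page-freeing process a chance to run.  The name fork_with_retry and its
parameters are hypothetical, not anything from the vendor.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: call fork() up to max_tries times, sleeping
 * delay_secs seconds between attempts when the failure looks transient.
 * Returns what fork() returns: child pid in the parent, 0 in the child,
 * or -1 with errno set if every attempt failed.
 */
static pid_t fork_with_retry(int max_tries, unsigned delay_secs)
{
    int attempt;

    assert(max_tries > 0);

    for (attempt = 0; attempt < max_tries; attempt++) {
        pid_t pid = fork();

        if (pid >= 0)
            return pid;         /* success, in parent or child */

        /* Assumption: only EAGAIN and ENOMEM are worth retrying. */
        if (errno != EAGAIN && errno != ENOMEM)
            return -1;          /* some other error: give up at once */

        sleep(delay_secs);      /* let the system free some pages */
    }

    errno = EAGAIN;             /* sleep() may have disturbed errno */
    return -1;
}
```

A caller would use fork_with_retry(5, 2) wherever it now calls fork().  This
only papers over the problem from user space: it does nothing about whether
dupreg ought to sleep, and a machine that stays out of pages will still fail
the fork after the last attempt.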

I must say this, though: while I've never seen an identifiable fork failure on
one of the 6386en, I *have* seen a phenomenon which I call Kernel Narcolepsy:
the whole system just seems to fall asleep now and then.  I had one machine a
couple of months ago that had an extremely sick disk.  To make sure another
machine didn't have the problem I intentionally loaded it with enough
continuous compiles of our database language (Progress) to cause solid
thrashing.  Every now and then the thrashing would just stop.  After about 5
minutes it would pick up again.  I don't know for a fact that it was really
sleeping: it could have been a kind of beat frequency where the processes just
happened to hit on the same pages.  But I did suffer one definite case where
the whole system went to sleep and even though characters would echo I could
get no response from any getty and the system was definitely just plain stuck.
This took a full reboot, fsck found minor damage, etc. etc.

So I guess the question is this: If the dupreg call from fork allows sleeps,
could this lead to a deadlock?  Is it possible I may be seeing this on V.3.2?

If the dupreg call *can be* safely changed to allow sleeping then my Altos
problem is a flat out case of a bug in their System V.3.1.  If it *can't*
safely be changed, then as I understand the situation V.3 DOES NOT RELIABLY
IMPLEMENT VIRTUAL MEMORY!!  Is it not true that pages are freed by an
asynchronous kernel process?  Is it not true that, given the indeterminate way
things work in UNIX, one cannot absolutely guarantee when this process will
run?  If you can't allow a sleep from fork in dupreg then the only way of
guaranteeing that fork won't fail is to guarantee that you don't page.  I.e.
if you page, you run a certain risk that forks will fail no matter how much
swap space you have.  The only way to guarantee fork will never fail is to
guarantee you don't page.  I.e. don't really exercise virtual memory.  I.e.
V.3 virtual memory is NOT RELIABLE because if you use it you may trigger fork
failures.

Please tell me it ain't so!!!!!
-- 
Jim Rosenberg                        pitt
Oglevee Computer Systems                 >--!amanue!oglvee!jr
151 Oglevee Lane                     cgh
Connellsville, PA 15425                  #include