Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!pt.cs.cmu.edu!cadre.dsl.pitt.edu!pitt!amanue!oglvee!jr
From: jr@oglvee.UUCP (Jim Rosenberg)
Newsgroups: comp.unix.i386
Subject: Re: Help! Altos 5.3.1 fork is failing!
Message-ID: <509@oglvee.UUCP>
Date: 19 Oct 89 17:00:27 GMT
References: <506@oglvee.UUCP> <4219@cuuxb.ATT.COM>
Reply-To: jr@oglvee.UUCP (Jim Rosenberg)
Organization: Oglevee Computer Systems, Connellsville, Pa
Lines: 62

In article <4219@cuuxb.ATT.COM> dlm@cuuxb.UUCP (Dennis L. Mumaugh) writes:
>Ordinarily I don't answer questions like this as I work for
>support and customers pay money for answers, but ....

Thank you for going above and beyond the call of duty.  Since I have an
unreliable operating system for which we paid real money, it's a comfort
to know we don't have to pay more real money to find out how to get
relief from the defects in what we already paid our money for.

>In article <506@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg)
>writes:
>
>    What the bleep is getcpages?
>
>    [...]
>
>    How could it fail on a request to get only 1 page unless
>    I'm out of swap space?
>
>How did you guess?

Are you *ABSOLUTELY* sure this is the only way getcpages can fail???  I
already have one response to the contrary.

>    (Which I'm not.  We're getting these with many many
>    thousand blocks of free swap space -- we have a swap(1)
>    which will show these.)
>
>Not true! /etc/swap only shows actual use of swap not committed use
>of swap. Similarly for sar reports.

OK, you can tell me all you like that swap is broken and is lying to me
and that sar is broken and is lying to me (these are *my* fault???) and
that I really really am out of swap space, but frankly I just don't
believe this.  I *DID* add a new swap partition with swap -a (*before*
posting the original article, as a matter of fact.)  The system is
clearly using it.
I got one fork failure with no interactive users logged in -- we had 4
database servers up and one client batch job, which had three or four
child UNIX processes -- enough to page a bit perhaps but nowhere *NEAR
ENOUGH* loading to exhaust 24,000 blocks of swap space.  If my swap space
runs out with lots of users then I can deal with that, but if that were
my problem then the whole system would come crashing to its knees many
times a day.  I'm sorry, but I just don't believe you're right that every
fork failure happens because I truly am out of swap space.

>True, some code isn't very robust and ought to sleep and wait for
>less load, but people who do forks don't examine error codes, nor
>do people who do execs. fork and exec will return either ENOSPC or
>EAGAIN if you would check errno.
           ^^^

If **WHO** would check errno???  I beg your pardon?  I am supposed to dig
into cron with a can opener (we are a binary licensee, not source!) and
somehow "check" errno?  When I get a fork failure from a fork issued by
cron it cutely logs the fact that fork failed, and that it is
"rescheduling".  Right.  It then just falls asleep and no more cron jobs
run.  When csh gets the fork failure it simply reports "No more
processes".  Um, just what would you like me to check here?  It's *you
folks in AT&T* who should check errno, don't you think?
-- 
Jim Rosenberg                        pitt
Oglevee Computer Systems                 >--!amanue!oglvee!jr
151 Oglevee Lane                     cgh
Connellsville, PA 15425              #include
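
For code one does have source to, the quoted remark about sleeping and checking errno after a failed fork can be made concrete. What follows is only a minimal sketch, not anything taken from cron, csh, or the Altos kernel. The name fork_with_retry is hypothetical, and treating EAGAIN, ENOMEM, and ENOSPC as retryable is an assumption: the quoted article names ENOSPC and EAGAIN, while POSIX documents EAGAIN and ENOMEM for fork. The helper sleeps and retries with a doubling delay instead of giving up on the first failure.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid(), for callers that reap the child */
#include <unistd.h>

/*
 * fork_with_retry: hypothetical helper, named here for illustration only.
 * Calls fork() up to max_attempts times, sleeping between attempts when
 * the failure looks transient.  Returns what fork() returns: 0 in the
 * child, the child's pid in the parent, or -1 with errno set by the
 * last failed fork().
 */
pid_t fork_with_retry(int max_attempts, unsigned int delay_seconds)
{
    int attempt;
    pid_t pid = -1;

    assert(max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        pid = fork();
        if (pid >= 0)
            return pid;         /* success: child (0) or parent (> 0) */

        /* Assumption: only these errno values are worth retrying. */
        if (errno != EAGAIN && errno != ENOMEM && errno != ENOSPC)
            break;

        /* Back off before the next attempt; skip the sleep on the last. */
        if (attempt < max_attempts) {
            sleep(delay_seconds);
            delay_seconds *= 2;
        }
    }
    return -1;                  /* errno still describes the last failure */
}
```

A caller would use it where it would otherwise call fork(), for example `pid = fork_with_retry(5, 1);`, and then branch on 0, a positive pid, or -1 as usual. None of this helps with a binary-only program such as cron, where the retry policy is whatever was compiled in.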