Xref: utzoo comp.unix.i386:893 comp.unix.wizards:18833
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!gem.mps.ohio-state.edu!apple!rutgers!att!cbnewsh!dwc
From: dwc@cbnewsh.ATT.COM (Malaclypse the Elder)
Newsgroups: comp.unix.i386,comp.unix.wizards
Subject: Re: Help! Altos 5.3.1 fork is failing!
Message-ID: <4973@cbnewsh.ATT.COM>
Date: 21 Oct 89 06:57:22 GMT
References: <506@oglvee.UUCP> <2296@hcr.UUCP> <508@oglvee.UUCP>
Organization: The Legion of Dynamic Discord
Lines: 38

i originally sent a reply to the poster of the question stating that
the reason getcpages() is failing while trying to get 1 contiguous page
is that there is probably no free memory for a page table.

it's been a while since i looked at the problem, but i seem to remember
that the reason getcpages() can fail without sleeping is to prevent
deadlock-type situations. on release 3, there are certain process data
structures that are not swapped out: the ublock (depending on the
version), the page tables and DBDs, and maybe more. you can get into a
deadlock in which all memory is committed to these data structures and
no process can continue because they are all both holding memory and
waiting for more. allowing the sleep to happen is okay if you make the
sleep interruptible. then at least the user can attempt to abort his
program voluntarily (the problem is determining when you are in this
deadlock situation... you can't run user level programs to tell you
this).

my solution to this was really very simple. at fork time, the parent
knows how much memory it will take to create the new process (ublock,
page tables, DBDs, etc.). with this information, the parent can check
the freemem level, reserve the amount of memory needed to satisfy the
fork, and sleep until that amount of memory is available. this sleep is
safe since no resources have been committed to the child yet (the child
doesn't even exist).

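the reserve-before-commit idea above can be sketched as a small user-space model in C. this is only an illustration under stated assumptions, not the prototype code and not anything from the release 3 source: the names mem_pool, fork_cost, fork_pages_needed, fork_reserve, fork_commit, and pages_freed are all hypothetical, and where a real kernel would sleep and retry, this sketch just returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the free page pool (not kernel code). */
struct mem_pool {
    long freemem;   /* pages currently free */
    long reserved;  /* pages promised to forks that have not committed yet */
};

/* Hypothetical per-fork cost in pages for the non-swappable structures. */
struct fork_cost {
    long ublock_pages;
    long pgtbl_pages;
    long dbd_pages;
};

/* Total pages the parent must secure before creating the child. */
static long fork_pages_needed(const struct fork_cost *c)
{
    return c->ublock_pages + c->pgtbl_pages + c->dbd_pages;
}

/*
 * Try to reserve 'need' pages out of the unreserved part of freemem.
 * Returns true on success. On false, nothing has been committed to the
 * child, so a real kernel could safely sleep here and retry later.
 */
static bool fork_reserve(struct mem_pool *p, long need)
{
    if (p->freemem - p->reserved < need)
        return false;
    p->reserved += need;
    return true;
}

/* Turn a reservation into real allocations for the child. */
static void fork_commit(struct mem_pool *p, long need)
{
    assert(p->reserved >= need && p->freemem >= need);
    p->reserved -= need;
    p->freemem -= need;
}

/* Pages returned to the pool (for example by the paging daemon). */
static void pages_freed(struct mem_pool *p, long n)
{
    p->freemem += n;
}
```

a caller would compute the cost once, loop on fork_reserve() (sleeping between attempts in a real system), and call fork_commit() only after the reservation succeeds. the point of the split is that a failed reservation leaves the pool untouched, so the waiting parent holds nothing that another process needs.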
we prototyped this for release 3 and it was going to go into some
future release when they decided to use sun's VM architecture instead
of regions. i suspect that release 4 will have a similar problem but
i'm not sure.

if you don't have source to modify, i suggested to the original
requester that he set a very high value for GETPGSLO and GETPGSHI. this
will make the paging daemon very active and MAY prevent you from
hitting situations where freemem goes to zero. it's not guaranteed,
since requests for freemem are VERY bursty and the reaction time of
vhand is fairly slow.

danny chen
att!hocus!dwc