Newsgroups: comp.unix.aix
Path: utzoo!utgpu!dennis
From: dennis@gpu.utcs.utoronto.ca (Dennis Ferguson)
Subject: Re: malloc (was: making a request to IBM)
Message-ID: <1991Apr14.030748.18052@gpu.utcs.utoronto.ca>
Keywords: malloc psalloc paging space
Organization: none
References: <1991Apr9.024814.1141@appmag.com> <6644@awdprime.UUCP>
Date: Sun, 14 Apr 1991 03:07:48 GMT

In article <6644@awdprime.UUCP> mbrown@testsys.austin.ibm.com (Mark Brown) writes:
>| The problem: as you all remember, malloc() returns NULL only
>| when the process exceeds its datasize limit. If malloc returns a
>| non-null pointer, the memory may turn out to be exceedingly
>| virtual: there won't be any paging space behind it. AIX runs
>| out of paging space when the process actually uses the memory.
>| Various processes die. In Info, see `List of Books', `General
>| Concepts and Procedures', scroll ~1/3 down, `Paging Space
>| Overview'. See also psmalloc.c in /usr/lpp/bos/samples. Etc etc
>| etc.
>|
>| Personally, I think it's a bug. If there is no memory left,
>| malloc should return a NULL. IBM says it's a feature, catch
>| SIGDANGER if you don't like it.
>
>Yeah, I've heard complaints (and roses) on this one.
>The Rationale: Rather than panic the machine, we'd like for it to keep
>running as long as possible. Hence, we try to keep running at all costs,
>including doing things like this. So, when we do get close to the limit,
>we send a warning, than as we go over we start killing the biggest memory
>users. (Warning - this processes involved have been overly simplified).
>
>The Idea was to make the machine 'more reliable'. Our research led us
>to believe that many processes allocated more memory than actually used in
>page space (I think) and we used this knowledge. Understandably, many
>UNIX users either a) want the machine to panic, "like UNIX does"; or
>b) hate our algorithm for killing jobs. I also think we don't advertise/
>document the process involved enough to make it useful to users.
>
>So, do we go back to blowing up processes that allocate too much memory,
>even though that memory may actually be there by the time the process
>actually uses it? Do we go back to 'panic' when page space fills? There are
>reasonable arguments for doing this...

I'm old enough to have used vanilla Version 7 Unix when PDP-11s were in
vogue, and to be brutally frank the only Unix I can remember using which
panic'd when it ran out of memory was an early AIX on an RT, a system
which I hardly think qualifies as The Definitive Unix.

The behaviour of AIX is, from the user's perspective, a whole lot like
the behaviour of vanilla System V Unix, which also kills off random
processes when it runs out of memory (or used to, at least; I haven't
paid much attention recently). The only IBM value-added bit in this is
the signal. (To be fair, I do understand that the backing store
allocation policy is different internally from System V's, and is
actually more conservative. It looks pretty similar from the user's
perspective, though.)

BSD Unix doesn't (a) panic or (b) kill processes. I suspect what the
users who are complaining want is (c) malloc() to return NULL when the
machine runs out of memory, without panicking and without random
processes being killed (it is actually easier to do it this way than to
do either what System V or what AIX does).

Better to explain more exactly why AIX does what it does. It's so that
vendors who want to sell crufty old Fortran programs, which have no way
to do dynamic memory allocation, can ship binaries with huge static
arrays compiled in for people who want to solve big problems, and still
have the same binaries run on small machines to solve small problems.
To implement this you don't allocate backing store until a page is
touched, which means malloc() can't return NULL since it can't, in
general, know whether the Fortran program running at the same time is
actually going to use its pages or not.
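
For anyone who wants to see what "catch SIGDANGER" might look like next
to the touch-on-use behaviour described above, here is a rough sketch in
C. It is only an illustration under stated assumptions: the function
names watch_paging_space() and malloc_touched() are made up for this
example, the page size is whatever the caller passes in (4096 is a
guess, not something the system is asked for), and SIGDANGER only exists
on AIX, so the handler is compiled in only where that signal is defined.
Touching the pages does not make the allocation safe; it only moves the
moment at which backing store gets claimed to allocation time, and the
process can still be killed if the system goes over its limit.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Set by the handler when the system warns that paging space is low. */
static volatile sig_atomic_t paging_space_low = 0;

static void on_danger(int signo)
{
    (void)signo;
    paging_space_low = 1;
}

/*
 * Hypothetical helper: install the low-paging-space handler.
 * Returns 0 on success, -1 if SIGDANGER is not available on this
 * system or the handler could not be installed.
 */
int watch_paging_space(void)
{
#ifdef SIGDANGER
    return signal(SIGDANGER, on_danger) == SIG_ERR ? -1 : 0;
#else
    (void)on_danger;
    return -1;
#endif
}

/*
 * Hypothetical helper: allocate nbytes and write one byte to every page,
 * so that backing store is claimed now rather than at first use.
 * page_size is supplied by the caller (an assumption, e.g. 4096).
 * Returns NULL if malloc() fails, if nbytes is 0, or if a low-space
 * warning arrived while the pages were being touched.
 */
void *malloc_touched(size_t nbytes, size_t page_size)
{
    unsigned char *p;
    volatile unsigned char *vp;
    size_t off;

    assert(page_size > 0);
    if (nbytes == 0)
        return NULL;

    p = malloc(nbytes);
    if (p == NULL)
        return NULL;

    vp = p;
    for (off = 0; off < nbytes; off += page_size) {
        vp[off] = 0;              /* first write to this page */
        if (paging_space_low) {   /* warning arrived: back off */
            free(p);
            return NULL;
        }
    }
    vp[nbytes - 1] = 0;           /* make sure the last page is touched */
    return p;
}
```

A caller would run watch_paging_space() once at startup and then use
malloc_touched() in place of malloc() for the large allocations it
really intends to fill. The cost is exactly what the Fortran vendors
were trying to avoid: every page is paid for up front, whether or not it
is ever used again.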

You should understand, however, that killing off processes isn't the
"real" problem. People have used System V machines which do this for
years without complaining because, on your typical Unix box being put
to typical uses, running out of memory/page space is a rare occurrence.
On an AIX machine, however, with its humungous kernel and things like
the compiler and loader which consume prodigious amounts of memory when
running, running out of memory can be a daily occurrence. People don't
complain about System V because they never find out what happens when
memory runs out. With AIX, however, your average user ends up painfully
aware of how the system behaves when memory is used up, and so he
complains.

The real bug is that AIX is a memory pig. It would be useful to fix
this one.

Dennis Ferguson
University of Toronto