Path: utzoo!attcan!uunet!tank!ncar!ames!oliveb!Ozona!chase From: chase@Ozona.orc.olivetti.com (David Chase) Newsgroups: comp.lang.c Subject: Re: memory allocation Message-ID: <29003@oliveb.olivetti.com> Date: 14 Sep 88 01:09:43 GMT References: <1262@micomvax.UUCP> <499@poseidon.UUCP> Sender: news@oliveb.olivetti.com Reply-To: chase@Ozona.UUCP (David Chase) Distribution: comp.lang.c,comp.os.misc Organization: Olivetti Research Center, Menlo Park, CA Lines: 69 In article <499@poseidon.UUCP> psrc@poseidon.UUCP (Paul S. R. Chisholm) writes (quoting a letter from Van Wyk to Bentley): > times). I added the minimal bookkeeping necessary to keep my > own queue of free nodes of this particular kind, and reduced > the memory allocator's share of the time to about thirty > percent. >The key point here is that Van Wyk measured his program (with a >non-trivial test input), found where the program spent its time, and >*then* optimized it. This is one good approach (I've used it several times); another (if you simply must write your own memory allocator) is to use somebody else's benchmarks to ensure that you at least implement a good algorithm. Before writing yet another allocator, I suggest that (as a minimum) you read "Dynamic Memory Allocation in Computer Simulation" by Norman R. Nielsen, in CACM 20:11 (November 1977). I don't know if anyone has done a similar study of allocation techniques that have been invented since then, but I'd like to hear about it if they have. Here, I'll even save you the trouble. By most measures, the best algorithm maintains a free list for each node size that is requested (NO ROUNDING UPWARD EXCEPT FOR WORD ALIGNMENT). This algorithm was typically fastest (though buddy blocks has better worst-case performance) and nearly as effective in limited memory situations ("limited memory" = "sum of user requests is 80% of what is available, ignoring all overhead imposed by allocator"). On the "base demand list", this algorithm required only 85.6% of available memory to satisfy a request for 80%; the best required 83.6%, but was 10 ten times slower. On a more rigorous set of 18 test loads, this algorithm was best at 50% (succeeded 18 times, max of 42 IBM 360/65 microseconds for alloc-dealloc), best at 66.7 % (succeeded 18 times, 42 usecs) and probably best at 80% (succeeded 15 times, 169 uSecs). I say probably best because only selected algorithms were given the full treatment, and some other might have been more successful. However, of the algorithms given the full treatment, the most successful faster (worst-case) algorithm only succeeded 7 times. I should add that I've implemented a couple along these lines, and played with a few that other people have implemented, and I am happiest with the approach taken in the Boehm-Demers garbage collector (available from the Rice archive server*, I think). It uses multiple free lists for blocks below a certain size (4K - overhead), and a single free list for blocks larger than that. It can easily be convinced not to try to collect garbage, but unless you take delight in writing truly squirrelly code, it will correctly take out the trash. For more information about this collector, see Boehm, H., and M. Weiser, "Garbage Collection in an Uncooperative Environment", Software Practice & Experience, to appear, September 1988. Weiser experimented with a special version of the collector for tracing storage leaks; typically, the collector does a better job than programmers. I realize that electronic mail is at your fingertips, but I know that there are university libraries containing the cited material in Silicon Valley, Austin, Houston, and Boston, and probably other places besides those. I can't really recommend the UT or UH campuses, but walks through Stanford, Rice, MIT, or Harvard are quite pleasant. Get off your computer butts and go read some of this stuff. * mail -s "help" "archive-server@rice.edu" < /dev/null mail -s "send index sun-source" "archive-server@rice.edu" < /dev/null David Chase Olivetti Research Center, Menlo Park, CA