Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!apple!sun-barr!decwrl!shelby!neon!carcoar!wilson From: wilson@carcoar.Stanford.EDU (Paul Wilson) Newsgroups: comp.object Subject: Re: [generational] Garbage collection -- difficulty, cost, generality Summary: generational gc's are different; watch out for benchmarking traps Keywords: garbage collection times, generational gc Message-ID: <1990Mar15.231331.29545@Neon.Stanford.EDU> Date: 15 Mar 90 23:13:31 GMT References: <1990Mar13.011146.6019@Neon.Stanford.EDU> <2356@syma.sussex.ac.uk> Sender: news@Neon.Stanford.EDU (USENET News System) Reply-To: wilson@carcoar.Stanford.EDU (Paul Wilson) Organization: U. of Illinois at Chicago (UIC, *not* UofC or UIUC) Lines: 75 In article <2356@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk (Aaron Sloman) writes: > I (Paul Wilson) write: > >> ...................Don't expect a fast language on stock >> hardware to have gc speed costs of less than about 10% on average. >> I think in the 8-15 percent range myself. > .......... > >I thought 8-15 percept was rather high, so I thought I would do some >experiments in Pop-11 (a lisp like language with a Pascal-like >syntax) in the Poplog system on a Sun4/260 (stock hardware I think). > > [ ...description of programs run, short resulting gc times...] Two points: I probably didn't make it clear in my posting that I was referring to *generational* gc's. They have a higher base overhead, but perform better when faced with large, data-hungry applications. (They also tend to have short pause times that are less likely to intefere with interactive response.) Also, I'm unclear about the system you're running -- I don't know much about Pop. The programs you're testing appear not to stress a gc much. This is likely because they're short and they don't ever have much live data. Running a small program several times is NOT at all like running a big program once. Such experiments can be *very* misleadingly optimistic. (That's one of the reasons why I'm trying to collect scalable (Scheme) programs for benchmarking gc's on different problem sizes.) Think about it this way. After every execution of a program, all of its intermediate data objects, and probably its results, are garbage. They will incur _no_ copying cost. If you execute small programs repeatedly between gc's, you generally only copy live data for those few iterations that a gc happens to occur during. Objects from other iterations are all dead. Running programs that do a lot of number consing is particularly bad, because those objects are typically short-lived and will flatter your gc efficiency. (Unless you store pointers to the heap-allocated numbers into objects in an older generation. Then your performance can get bad fast.) Garbage collection times correspond roughly to the average amount of data that is live during execution. If you run a benchmark a bunch of times between scavenges, the raw memory usage scales up, but the copying cost does not. This creates an illusion that the gc is working very well. In real long-running programs, the amount of live data *is* typically larger, sometimes dramatically larger, and the copying cost goes up significantly. A big hunk of the cost of a generational gc is incurred _during_ _program_execution_, not just at scavenges. (Stores into the heap have to be checked and/or recorded, to keep track of intergenerational pointers.) So if you just record pause times, you miss this component of gc cost entirely. For little programs with little live data, this is likely to be the major cost of generational garbage collection. (I estimate a minimum of about 3%, but probably more like 6%, and maybe more, for fast optimized code.) Again, you can get _very_optimistic_ estimates of gc performance if you ignore this cost. The right way to test is this to recompile the code, leaving out the checking/marking code emitted for generational gc, and see how much faster your programs run without that continual overhead. (But maybe Pop-11 doesn't have a generational gc. If not, I guess it doesn't have a lot of heap-resident system stuff like big editors, browsers and compilers; that would increase the pause times.) -- Paul Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@bert.eecs.uic.edu Box 4348 Chicago,IL 60680