Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!apple!sun-barr!decwrl!shelby!neon!carcoar!wilson From: wilson@carcoar.Stanford.EDU (Paul Wilson) Newsgroups: comp.lang.lisp Subject: Re: LISP compiler? (really Incremental & Opportunistic GCs) Keywords: GC, opportunistic, incremental, responsiveness, efficiency Message-ID: <1989Nov28.202758.5021@Neon.Stanford.EDU> Date: 28 Nov 89 20:27:58 GMT References: <5130@internal.Apple.COM> <5149@internal.Apple.COM> <616@pyuxf.UUCP> <2159@cs-spool.calgary.UUCP> <4@wglen.UUCP> <5145@wheat-chex.ai.mit.edu> Sender: USENET News System Reply-To: wilson@carcoar.Stanford.EDU (Paul Wilson) Organization: U. of Illinois at Chicago (UIC, *not* UofC or UIUC) Lines: 110 In article <5145@wheat-chex.ai.mit.edu> CarlManning@ai.mit.edu writes: >In article <4@wglen.UUCP> bonham@wglen.UUCP (Mike Bonham) writes: >If you're just dealing with human interfaces rather than real time >requirements, then incremental garbage collection isn't the only way >to improve perceived performance --- instead you can try to schedule the long >scavenges during otherwise unresponsive periods, e.g. during >compute-bound operations or during idle periods. For more on this idea, >see Section 4: Scavenge Scheduling, (p 29) of: > > Design of the Opportunistic Garbage Collector > Paul R. Wilson and Thomas G. Moher > OOPSLA'89 Proceedings, 1-6 October, 1989 > SIGPLAN Notices vol24 n10, p 23-35 > >(This paper also includes some other ideas for implementing >generational GC on stock hardware and some performance measurements.) > >(Just passing on a reference -- I have no experience using it.) > >CarlManning@ai.mit.edu I should probably point out that ours is not the first gc to improve perceived responsiveness by opportunistic scheduling. On the other hand, we use opportunism both to improve responsiveness *and* to actually increase efficiency. Jon L White has reminded me that the Interlisp gc used the strategy of preferentially scheduling gc work during long user pauses ("think time"). Also, the Symbolics systems make the background scavenger of their incremental gc work harder during pauses than during program execution. (Even though the system is incremental, performance can be noticeably degraded by the background scavenger, so it's better for it to do its job when the user doesn't care.) Our gc is different in that it schedules gc's preferentially when there is likely to be less work to do. It's a copying collector, so the work it does is proportional to the amount of *live* data. By scheduling scavenges between major computations, we not only hide pauses but *reduce* them, since there are usually fewer live objects (holding intermediate results of ongoing computations) then. This didn't happen with the Interlisp gc, because it wasn't a copy collector. It also doesn't happen with Symbolics gc because it is the scheduling of *flips* (root set scans) that primarily determines the amount of data seen as live by the gc. There are really several different issues here. One is efficiency, another is disruptiveness, and another is real-time response. Only a fine-grained incremental gc can guarantee short pauses and real-time response. A non-incremental generational gc can usually come close enough for most people's purposes. (If you're willing to do a little tuning, you may be able to guarantee acceptable pauses for a particular application by forcing scavenges of the right numbers of generations at the right times. Our system attempts to do this acceptably well automatically, but may fail.) Appel, Ellis, and Li's gc avoids the continual checking for forwarding pointers that a fine-grained incremental gc requires. I wouldn't quite call it real-time, though, because the unit of work is fairly large (copying the referents of all of the pointers in a page) and these units may come in bursts. It may not work as well for the youngest generation of a generational system, because a higher percentage of pages in younger generations is likely to be touched in a short period of time. So you still may need tuning or opportunism, and you might want to use a hybrid scheme. (For example, opportunistic generation scavenging for the youngest generation, and Appel-Ellis-Li style incremental scavenging of older generations.) For difficult real-time applications, you'll still need a fine-grained incremental gc with its pointer-checking overhead. Another issue our paper dealt with is the number of generations a generational gc should have. We maintain that it's at least 3. Our own numbers at this point are still rather preliminary, but we've gotten several anecdotal reports lately of real-world programs that a two-generations scheme falls down on. (That is, they have too much data with intermediate lifetimes that force full gc's too often.) Having more generations can have drawbacks too, but they seem less severe, and can usually be handled with a teeny bit of opportunistic scheduling. (Note: your mileage *will* vary. A two-generation scheme seems to be absolutely fine for most people's programs. It's just some troublemakers out there... :-). Also, our current view is that a mixed strategy is probably best for tracking intergenerational pointers. For most programs, Ungar's remembered set scheme incurs less overhead than our "card marking," but is occasionally bad for large, long-lived objects. (E.g., you may get 4-megabyte array in your remembered set and scan it entirely at every scavenge!). Probably the way to go is to segregate large objects into a separate "large object area" as Tektronix Smalltalk does, and to use card marking in this area (only) to avoid scanning the whole objects. -- Paul Paul R. "Garbage Man" Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680 Paul R. Wilson Software Systems Laboratory lab ph.: (312) 996-9216 U. of Illin. at C. EECS Dept. (M/C 154) wilson@carcoar.stanford.edu Box 4348 Chicago,IL 60680