Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!mnetor!seismo!rutgers!ames!ucbcad!ucbvax!hplabs!hplabsc!bs From: bs@hplabsc.UUCP (Bob Shaw) Newsgroups: comp.lang.lisp Subject: Re: Garbage collection question Message-ID: <1912@hplabsc.UUCP> Date: Thu, 28-May-87 23:24:10 EDT Article-I.D.: hplabsc.1912 Posted: Thu May 28 23:24:10 1987 Date-Received: Sat, 30-May-87 09:56:30 EDT References: <4631@columbia.UUCP> <1282@tekchips.TEK.COM> Distribution: world Organization: Hewlett-Packard Laboratories Lines: 69 Summary: Real-time? ... and a pointer to a cheap generation scheme Generation collectors can be more "real-time" than real-time collectors. In discussing garbage collection techniques, a point has been raised which I'm quite curious about. The term "real-time" has been used in all of the articles I've seen in the discussion. What is the nature of the "real-time" response that people are seeking? Is it simply a matter of not wanting a several second (minute) delay in response while editting a program, or are we talking about the needs for smooth 30 frame a second animation, or is it motivated by the need to respond to asynchronous events within microseconds of the event? Also, is this "real-time" behaviour sustained (e.g. as in a heart arrhythmia monitor) or bursty (e.g. as in gathering data from a two second car barrier crash)? Depending on the definition of "real-time" used, generation schemes can easily match or exceed the performance of the real-time collectors (e.g. especially in bursty situations). In Baker's paper (Comm of the ACM April 1978) he states when talking about adding vectors and arrays to his model, "However, the method can no longer claim to be real-time because neither the time taken by the array allocation (ARRAY-CONS) nor the time taken by the array element accessing function is bounded by a constant." In practice, it is possible to bound these operations by placing a limit on the size of allowable structures. Later, in the same paper, Baker comments, "Our scheme is still real-time on a virtual memory computer, but the bounds on the elementary list operations now have the order of magnitude of secondary storage operations." Thus, given systems with vectors and virtual memory, the bound on delays is now in the range of tens of milliseconds (for common secondary storage technology). In a 1984 Lisp Conference paper Dave Moon described some users reactions to the original incremental (real-time) collector implemented on the 3600's: "They found that garbage collection made interactive response so poor and unpredictable, and its overhead was so high, that it was preferable to turn off garbage collection and simply wait for the inevitable exhaustion of virtual address space (at which point the system crashes and must be re-booted)." According to Dave Ungar's dissertation (Berkeley Tech Report UCB/CSD 86/287), on a 68010 class machine running Berkeley Smalltalk, the mean time between Generation Scavenging collections was 16 seconds and the pauses due to collections ranged from 90 to 330 milliseconds with a mean of 160 milliseconds. In all fairness, his algorithm is designed to minimize VM paging and these numbers do not include the initial paging necessary to bring the pages in the first time. Nonetheless, most people would have a tough time spotting a .16 second delay during an editting session. I'd like to propose that "real-time" should be replaced by "incremental" in describing the algorithms since the bounding values can be large and difficult to establish. For many situations, the performance of Generation schemes is such that a user would not be able to determine whether a Generation or "real-time" algorithm was in use. Do people really desire guaranteed time bounds on garbage collection? Over and out, Bob Shaw HP Labs bs@hplabs.hp.com P.S. For those interested, I've written a tech report describing a Generation scheme that runs on stock hardware with no generation tags and essentially no runtime slowdown. It uses a simple heap layout and information from the virtual memory system. In the worst case, it degenerates into a conventional stop and copy algorithm. Empirical data about Lisp data lifetimes (Boy, are they short!) is also included. It is available as Stanford Technical Report CSL-TR-87-323 and entitled, "Improving Garbage Collector Performance in Virtual Memory."