Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!gem.mps.ohio-state.edu!ginosko!aplcen!haven!adm!cmcl2!lanl!opus!ted From: ted@nmsu.edu (Ted Dunning) Newsgroups: comp.sw.components Subject: Re: Real-time Garbage Collection Message-ID: Date: 18 Sep 89 15:19:31 GMT References: <91482@ti-csl.csc.ti.com> <6488@hubcap.clemson.edu> Sender: news@nmsu.edu Organization: NMSU Computer Science Lines: 94 In-reply-to: billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu's message of 18 Sep 89 11:21:36 GMT i find that i am mostly in agreement with mr. wolfe when In article <6488@hubcap.clemson.edu> billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe, 2847 ) writes: The point is, any system which uses GC is going to be considerably less efficient (and constant factors are important here!) than a competing system in which the fact that an object is no longer needed is directly communicated to the storage management system. this is not always true. if the objects are heavily shared, or if point in time that the object becomes unreferencable is obscure, then this is a difficult problem. people have used reference counting in the past to solve this problem, but the general consensus has always been (on lisp implementations) that devoting enough bits to reference counting was impractical (it certainly is if every value with pointer semantics must be tagged) and that reference counting collection was more expensive than stop and copy. in addition, reference counting does not compact storage as does any of the various forms of stop and copy. All that is necessary is to encapsulate the details of storage management within abstract data types, such that the destruction algorithm is integrated into the block exit mechanism. this is not sufficient. we also need to handle assignment (the old value of the assignee becomes less referenced), and formal argument passing (where we suddenly have one more reference). in c++ there is also all of the complexity of implicit and explicity type conversion. trickier is the issue of continuations and first class functional objects. somehow all variables that support reference counting styles of storage management which are referenced by a continuation, or by a functional object, must be marked as referenced and not reclaimed on block exit. By doing this, we place a level of abstraction between the application programmer and the details of storage management. certainly the goal. The use of pointers is confined to the ADT's implementation, and the application programmer is shielded from all the resulting complexities. again, motherhood and apple pie. The interface also simplifies the work of the ADT's implementor; the destruction algorithm is usually quite straightforward. this is only straightforward in a language that does not support advanced concepts. (like ada and c++). once you talk about a language as strong as scheme, then you have a much more interesting problem. one that is, incidently, easily handled by conventional garbage collectors. in fact, reference counting is not the only possibility in a language like c++ which provides some knowledge of scope entry and exit for the programmer. instead, it is possible to keep the equivalent of an obarray, pushing and popping references to variables as needed. then when gc is desired, one can do a stop and copy or mark/sweep with an extra pass to correct all the pointers. Worrying about GC is essentially barking up the wrong tree; what we need is better reuse mechanisms, so that we can also gain the added benefit of exploiting the implicitly supplied information regarding the fact that a given variable is no longer needed which is naturally produced (with no special effort on the part of the application programmer) upon each and every block exit. this _is_ garbage collection. it just isn't stop and do x garbage collection. it should be kept in mind that in general, collection overhead increases as you go from generational collection to stop and copy to mark sweep to reference counting to `real time' collection methods (almost all of which are based indirectly on baker's algorithm in the august '78 issue of cacm). if you care about average throughput, use stop and copy, and maybe a generational copier. if you want a very simple system, use ref counting. nobody really uses the real time methods yet because they are expensive and complex. -- ted@nmsu.edu Most of all, he loved the fall when the cottonwoods leaves turned gold and floated down the trout streams under the clear blue windswept skies.