Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!pt.cs.cmu.edu!wb1.cs.cmu.edu!ram
From: ram@wb1.cs.cmu.edu (Rob MacLachlan)
Newsgroups: comp.sw.components
Subject: Re: Reasons for low reuse
Message-ID: <6176@pt.cs.cmu.edu>
Date: 18 Sep 89 14:22:11 GMT
References: <175@ntmtv.UUCP> <11701@boulder.Colorado.EDU>
Organization: Carnegie-Mellon University, CS/RI
Lines: 61

>From: hopper@ntmtv.UUCP (Ian Hopper)
>Newsgroups: comp.sw.components
>Subject: Re: Reasons for low reuse
>Date: 13 Sep 89 03:43:32 GMT
>
>I have always felt that some form of Garbage Collection is
>needed in order to do a decent job on reusable "Collection"
>modules.
[...]
>
>The classic approach would be to copy-in and copy-out the contents
>of such containers, but simply passing pointers to contents is
>often much faster.
>
>Do we focus our efforts on languages that have garbage collection,
>or are we forced to use the copy-in/copy-out approach?  Regrettably,
>the USING code is quite different in the various cases.

It seems to me that once you admit dynamic allocation into the language, one
must at least concede the \desirability/ of having GC.  The only question is
whether the cost of supporting GC is excessive.  There are two costs that I
see:
 1] A performance penalty due to GC time and tag manipulation.  Unless GC
    support is controlled by a compiler switch, some of this overhead is also
    present in programs that don't use GC.  GC can also cause unpredicable
    delays.
 2] At the language level, a certain discipline in pointer use is required
    so that the garbage collector can locate pointers and not be confused by
    random data that might be a pointer.  

Being a Lisp programmer, it is understandable that I don't find either of
these costs prohibitive.  The performance penalty isn't as much as people
suppose: a lot of progress has been made in GC algorithms in the past ten
years, and these algorithms have recently begun to see practical use in
languages such as Lisp, Smalltalk and ML.  Also, as you hinted at above,
there are often hidden efficiencies that can only be safely realized in
systems that support GC.  So far as the language style issue, this isn't a
big problem for high-level languages other than C.

I think that Lisp is a constructive proof of the reuse potential in systems
that support "generic types", GC and first-class functions (e.g. procedures
in records, etc.)  As many whining system managers will tell you, Lisp
systems are big.  That bigness is mainly because a Lisp image has lots of
code in it; Lisp programs tend to have big working sets because lots of
that code is used.  

Coming from a Lispy language perspective, the main challenge is not to make
the system smaller, but to make it bigger so that there is more stuff out
there that the programmer can use.  (Of course, we want to maintain similar
cost-effectiveness, rather than just bloating with junk.)

>PS: Writers of collection modules in GC-based environments should not
>forget to create dynamically-expanding versions of their collections. 
>Itis very annoying to have to estimate the size of collections in
>advance, things always blow up on large data sets.

Yep, this is often something that we have to flame new Lisp programmers for
when we are breaking them in.  Of course, if you use a linked-list
implementation of your data structures, this happens for free.

  Rob