Path: utzoo!attcan!uunet!cs.utexas.edu!sun-barr!newstop!exodus!rbbb.Eng.Sun.COM!chased
From: chased@rbbb.Eng.Sun.COM (David Chase)
Newsgroups: comp.object
Subject: Re: C++ and garbage collection
Message-ID: <1044@exodus.Eng.Sun.COM>
Date: 28 Sep 90 19:32:29 GMT
References: <966@exodus.Eng.Sun.COM> <PCG.90Sep27200339@odin.cs.aber.ac.uk>
Sender: news@exodus.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, Ca.
Lines: 78

In article <PCG.90Sep27200339@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>On 26 Sep 90 21:45:14 GMT, chased@rbbb.Eng.Sun.COM (David Chase) said:
>
>chased> One thing I haven't noticed yet is mention of transformations
>chased> applied to pointers by an optimizing compiler.
>
>Point one: I do not like optimizing compilers :->. Point two:
>conservative g.c. *may* require compiler cooperation.
...
> Let's switch the burden of proof: let the proponents of manual
>storage reclamation ... prove that *their* alternative overall
>involves less overheads than even 'volatilized' conservative g.c.,
>in a *system wide* comaprison (speed of compiled code + overhead of
>storage reclamation).

I agree that conservative GC may require compiler cooperation, but in
practice anyone using a translator to C is not getting it.  I also
agree that time spent in the garbage collector may be paid back in
other ways (faster allocation, increased locality of reference).
These effects can be significant; Edelson shows that allocation (that
is, allocation and initialization in the building of an expression
tree) using his compacting collector is over twice as fast as
allocation using the system-supplied malloc, and over three times
faster than allocation using reference-counting reclamation.

However, the advantage is not so clear when the time for garbage
collection is added; it varies from twice as fast (all objects
discarded) to slightly slower (no objects discarded).  The
conventional wisdom for Lisp and Smalltalk is that most objects are
discarded, implying that a figure of "twice as fast" ought to hold.

However
(1) Conventional wisdom for Lisp/Smalltalk may not hold for C/C++;
    Lisp and Smalltalk rely on garbage collection for activation
    records and real numbers, for instance, and C does not
    (optimization of Lisp cleans this up somewhat).

(2) Edelson's benchmark compares the *standard* unix allocator with
    his garbage collected allocator.  It is (unfortunately) fairly
    standard practice to interpose user-managed storage caches for
    high-traffic storage sizes, and thus run-time expense of manual
    storage reclamation is easily avoided without turning off the 
    optimizer, and without pauses for garbage collection.

(3) On one benchmark of C (*) that is already heavily hand-optimized,
    and making heavy use of memory allocation (allocating 3-10 meg),
    unoptimized code takes 40% longer than unoptimized code.  I expect
    that the costs will only get larger in the future; new machines
    are not being designed with unoptimized code in mind.

(4) I've used Hans Boehm's collector in place of malloc with this very
    same benchmark, compiled with full optimization, in conservative
    (not hyper-conservative) mode.  It ran 10% slower.

(5) Profiling indicates that memory management takes less than 3% of
    the time; thus, reducing its cost to zero would not offset the
    additional cost of running unoptimized code.

Now, I've said nothing about time wasted by the programmer in writing
the manual storage management code, and nothing about the likelihood
of bugs in the two systems.  Those two costs put me strongly in favor
of garbage collection.

However, those two costs also put me strongly in favor of compiler
optimizations -- as a rule, the compiler does a faster, more correct,
more effective, and more reliable job of micro-optimization than I do,
and though I prefer garbage collection, I do not like the cost of
turning off the optimizer, and I think many people would like it even
less than I do.

David Chase
Sun Microsystems

(*) The benchmark is a bottom-up tree pattern matcher generator.  You
can read about the algorithmic improvements in the 1987 POPL
proceedings.  The code has been heavily hand-optimized across Vax
11/750, Sun 3, and IBM RT.  I can send the code to anyone who cares; I
wrote it as a student.