Path: utzoo!attcan!utgpu!watmath!att!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!wuarchive!udel!gatech!hubcap!billwolf%hazel.cs.clemson.edu
From: billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu (William Thomas Wolfe, 2847 )
Newsgroups: comp.sw.components
Subject: Re: Garbage Collection & ADTs
Message-ID: <6530@hubcap.clemson.edu>
Date: 21 Sep 89 16:08:26 GMT
References: <900@scaup.cl.cam.ac.uk>
Sender: news@hubcap.clemson.edu
Reply-To: billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu
Lines: 107

From scc@cl.cam.ac.uk (Stephen Crawley):
>> This ADT will do dynamic allocation INTERNALLY so as to facilitate
>> its own expansion and contraction as necessary.  Upon block exit, 
>> the ADT's destroy procedure will be invoked, and all the space
>> occupied by the ADT will be released.
>
> OK ... in the trivial case your scheme does work.  The conditions
> for it to work include:
> 
> a)	NO references to the ADT are passed out of scope.

   In keeping with the general principle that application programmers
   should not be using pointers; appropriate ADTs should be used instead. 

> b)	the scope exits before you run short of space.

   Or you have programmed your application to properly handle the
   exception STORAGE_ERROR.  The sequence is then as follows:

    1) An ADT request comes in for which the ADT decides it will
       need more space.  The ADT attempts to obtain the space, but
       STORAGE_ERROR is raised.  Fortunately, the ADT is programmed
       so as to guarantee to the user that the ADT will remain in
       its original state (i.e., the transaction will abort).  

    2) The ADT now either directly propagates STORAGE_ERROR or 
       something equivalent (Unable_To_Comply, for example) to 
       the user, in accordance with the ADT's specification.  

    3) The user can then take whatever steps are deemed appropriate
       in order to handle the exception.  Note carefully that only 
       the user is in a position to decide what data structures 
       to throw away in such a crisis.  

> c)	the programmer (i.e. the application!) does not incorrectly tell 
> 	the ADT to release an instance too soon.

   Since the default is that the ADT will die automatically upon going
   out of scope, the user will presumably refrain from invoking Destroy
   without a fairly good reason.

>   o  GC'ed memory management can reclaim junk (heap nodes that are no longer
>      needed) whenever it runs out of space, whereas a scope based scheme
>      can in general only identify and hence reclaim junk on scope exit.
>      In other words, scope based reclamation of memory will in general 
>      need more memory.

   Not true.  Consider -- what precisely is junk?  If we define it as
   "memory that is no longer needed due to deallocate operations", then
   the ADT will automatically free up all such memory by virtue of the
   semantics of the deallocation process; an order to deallocate 
   constitutes a guarantee from the ADT that it is completely safe 
   to do so; it knows this because it has received an order from the
   user (e.g., Remove_Item) which removes the need for the space, 
   and since the ADT is in total control of the space involved, it can
   proceed to issue the deallocation order and thereby reduce the size
   of the ADT's representation.  
 
   If we define junk as "memory which is strictly not necessary at this 
   point in the program", then we require a "Read Programmer's Mind" 
   instruction.  Not happening to have such an instruction, we must rely
   on the programmer to communicate the fact that a data structure is 
   no longer necessary via the Destroy command.

   Incidentally, the ADT can provide as one of its services a function
   which indicates the amount of space currently occupied by the ADT, 
   which the user can make use of when it becomes necessary to decide 
   which data structures to throw away.  This is made possible by the
   'SIZE attribute in Ada; the ADT simply adds up the 'SIZE of all the
   space it is occupying, and returns that value to the user.  The user
   can then simply select the data structure for which the cost function 
   (Size/Value) is maximal and order it Destroyed, and then proceed to 
   retry the aborted operation.
 
> Finally, don't forget that the case where condition a) is not true; i.e.
> some of the dynamically allocated ADT instance are passed out of scope
> either explicitly (as a result) or implicitly (by side-effect).

   Which can't happen unless either the applications programmer is 
   using pointers directly (naughty naughty), or something called by the
   application programmer (in effect, an extension of the application
   programmer) is doing it on his/her behalf.  In the latter case, we
   this presumably is in accordance with the called unit's specification, 
   and hence remains the fault of the applications programmer.

   The rule is simple: Leave the use of pointers to ADT implementors,
   and simply reuse the code they produce.  Exploit to the absolute 
   fullest the hard guarantees (proper space management, serializability, 
   recoverability, concurrent processing, etc.) that the ADT implementors
   worked so hard to provide you with, and thy Manager will be pleased.
   
> This whole GC versus non-GC debate seems to hinge the following trade-off.
> Do we want programs that are slower (by say 25%), but are intrinsicly 
> free of a large class of storage management errors, or are usually faster
> (not always), but may use more storage, may have potential storage leaks 
> and other storage management bugs, and which arguably take longer to 
> develop.

    All of the negatives you cite (except "may use more storage", which 
    is bogus) are overcome by the reuse paradigm.  The ADT is developed
    with extremely high quality at a high cost, but this cost is then 
    spread over an extremely large number of users until the end of time;
    the result is very much an economic win-win situation.


    Bill Wolfe, wtwolfe@hubcap.clemson.edu