Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!rex!ames!decwrl!shelby!neon!carcoar!wilson
From: wilson@carcoar.Stanford.EDU (Paul Wilson)
Newsgroups: comp.lang.lisp
Subject: Re: Garbage collection algorithms
Summary: reorganizing objects/pages/blocks dynamically to improve locality
Message-ID: <1990Feb2.095755.9760@Neon.Stanford.EDU>
Date: 2 Feb 90 09:57:55 GMT
References: <366@argosy.UUCP> <1990Jan22.111348.5585@Neon.Stanford.EDU> <BDS.90Jan22123550@mbunix.mitre.org> <108553@ti-csl.csc.ti.com>
Sender: news@Neon.Stanford.EDU (USENET News System)
Reply-To: wilson@carcoar.Stanford.EDU (Paul Wilson)
Organization: U. of Illinois at Chicago (UIC, *not* UofC or UIUC)
Lines: 129

In article <108553@ti-csl.csc.ti.com> djohnson@m2.csc.ti.com (Doug Johnson) writes:
>In article <BDS.90Jan22123550@mbunix.mitre.org> bds@mbunix.mitre.org (Smith) writes:
>>
>>I have a paper written by Bob Courts of TI called "Improving Locality
>>of Reference in a Garbage-Collecting Memory Management System" that
>>describes how TI implemented gc on the Explorer.  I found it to be an
>>interesting paper, and well worth reading.  It was given to me by a TI
>>salesman, and I don't know if it was ever published.  
>
>It was published in Communications of the ACM, September 1988.  While
>there is some discussion of GC in general, it focuses on GC in a
>virtual memory environment.  I believe the tradeoffs are very
>different in a physical memory system.  Paul Wilson's suggestions are
>close to what I'd try.
>
>-- Doug

In article <108553@ti-csl.csc.ti.com> djohnson@m2.csc.ti.com (Doug Johnson) writes:
>In article <BDS.90Jan22123550@mbunix.mitre.org> bds@mbunix.mitre.org (Smith) writes:
>>
>>I have a paper written by Bob Courts of TI called "Improving Locality
>>of Reference in a Garbage-Collecting Memory Management System" that
>>describes how TI implemented gc on the Explorer.  
>  [...]
>It was published in Communications of the ACM, September 1988.  While
>there is some discussion of GC in general, it focuses on GC in a
>virtual memory environment.  I believe the tradeoffs are very
>different in a physical memory system.  Paul Wilson's suggestions are
>close to what I'd try.
>-- Doug

If you're interested in neat things to do to improve locality in interesting
hierarchical memories (as opposed to a pc), there are actually several
similar things worth investigating -- all of them apparently derived
independently by various folks.

Courts' system is actually strikingly similar to the system proposed in
Jon L White's paper for the 1980 Lisp Conference, "Address/Memory
Management for a Gigantic LISP Environment".  The basic idea of both
is to use Baker-style incremental copying to improve locality of
reference when garbage collection isn't sufficient to do the job.  In
the case of White's proposed system, that's because there's no
garbage collection at all(!) most of the time.  In Courts' system,
the same technique is used in a generational garbage collector.

This also bears a strong resemblance to an idea proposed in (and simulated)
in a paper by Jean-Loup Baer and Gary R. Sager ("Dynamic Improvement
of Locality in Virtual Memory Systems", IEEE TSWE, March 1976), way
back in the mid-70's.  Their idea is that the unit of virtual memory
mapping ("minipages") be smaller than the unit of disk transfer,
and that minipages be grouped according to dynamic reference patterns
within disk pages.  This effects a dynamic reorganization of minipages
within disk pages.  The four least-recently-used minipages are grouped
together and ejected on a miss, and when a page is brought in, so are
the other 3 minpages on its disk page.  The result is to effectively 
prefetch related minipages on a page miss.

Unfortunately, Baer & Sager's technique failed to give an improvement
in performance, but I think that's because they didn't give
it a fair test.  I believe the failure of their system to do the trick
was because of the particular characteristics of computers in 1975.
If you tried it now, on a modern computer, it would probably work pretty
well.

It's my belief that one of the important prerequisites for effective
prepaging is simply that you have a lot of memory pages.
That is, if devoting a page to prefetching sacrifices more than about
1% of your memory, then you lose -- the increased misses caused by
the smaller memory will not be compensated by successful prefetches.  On
the other hand, if you have more than about 100 pages, devoting one to
holding prefetches is worthwhile -- it doesn't cost you many misses,
and it will prefetch a useful page often enough.


So it seems that if you have more than about 100 units of memory (pages in your
main memory, or cache blocks in your cache), then prefetching is worth it.

If you have significantly fewer pages/blocks, you're better off devoting them
all to normal caching.  If you have significantly more, you're better off
spending a few on prefetching.

Simple, no?  Am I right?  I don't know.  I hope to find out soon.
But it is very interesting to note that computers these days have many
more units (pages/blocks) at a level of storage than they did in 1975.
This is necessitated by the much higher ratio of CPU speeds to RAM speeds,
and RAM speeds to disk speeds -- computers these days *must* have big
memories and caches, and higher hit rates, or they just won't cut it.

So the marginal cost of using a block for prefetching is smaller now, while
the relative benefit is about the same.  This says to me that prefetching
is an increasingly good idea, every day.


Anyway, to get back to our story, Baer & Sager's technique appears to
have failed because they had to effectively reserve 3 "minipages" for
prefetching to be able to reorganize 4 minipages at a time within the
disk-transfer pages.  (They maybe also didn't reorganize them the
way they should have, though in fact there may be a better order
that's much cheaper to implement.)  Since 3 minipages was a couple
of percent of even the largest memory they tried, it was a lose.


It turns out that people at the U. of Manchester have come up with
a simliar system, reorganizing cache blocks within disk pages.  This
is in a hardware-supported object-oriented memory for Smalltalk.  The
effects seem similar to those of Courts' reorganization during gc's.
(see "Dynamic Grouping in an Object-Oriented Virtual Memory Hierarchy,"
Proc ECOOP '87, pp. 87-96, and "Realisation of a Dynamically Grouped
Object-Oriented Virtual Memory Hierarchy," I.W. Williams, M.I. Wolczko,
and T.W. Hopkins, in Proc. of the Workshop on Persistent Object Systems: 
Their Design, Implementation, and Use, pp 298-308, August 1987.  Persistent
Programming Research Report, Edinburgh University, Dept. of Computer Science
(PPRR-44-87).

Naturally, I found out about this because I reinvented the same wheel
and eventually somebody told me so :-)

  -- Paul


Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680 

Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@carcoar.stanford.edu
Box 4348   Chicago,IL 60680