Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!lll-winken!unixhub!shelby!neon!craig
From: craig@Neon.Stanford.EDU (Craig D. Chambers)
Newsgroups: comp.object
Subject: Re: Precise GC (was Re: Do we really need types in OOPL's?)
Message-ID: <1990Oct17.001308.2003@Neon.Stanford.EDU>
Date: 17 Oct 90 00:13:08 GMT
References: <563@roo.UUCP> <1990Oct12.221032.20917@Neon.Stanford.EDU> <570@roo.UUCP>
Organization: Stanford University
Lines: 56

In article <570@roo.UUCP> boehm@parc.xerox.com (Hans Boehm) writes:
>let
>    y = X
>in
>    i := 0;
>    while i < 10000 do
>        i++;
>        z := y * y + g(i);
>        f(z);
>    endwhile
>
>I would certainly like my compiler to move "y * y" (or any such expression)
>out of the loop.  But that isn't allowed, since doing so would preserve
>the temporary across the procedure call.  If f ends up invoking the garbage
>collector, I will fail to reclaim the storage used to hold y*y.  Thus some
>conservativism would have crept back into my collector.

Yes, you are right about this aspect of the language being undefined
(i.e. when temp expressions have their space reclaimed).  All the
language specifies is the minimum time that an expression must remain
allocated, not when it is required to be reclaimed.  All kinds of
optimizations might lengthen the effective lifetime of heap storage
(one of which you mention above).

>Your factor of 2 presumably buys me all sorts of things, so I'm not quite
>sure what to make of it.  For most applications, I would be very unhappy
>if I had to pay a factor of 2 solely for "precise" collection.  It would
>be nice to know how those costs broke down.  I suspect that not being able
>to reuse registers hurts a lot more on a '386 than a SPARC.

Yes, you're right again, my numbers don't really say anything about
the cost of preserving "dead" variables solely for the debugger.
My intuition from examining the generated code is that in C-style
benchmarks, all the message sends are eliminated (in the common case
branches), and so there is virtually no run-time cost for this
debugger model (since the only place that the debugger could look at
the values of variables is at the interrupt point at the end of the
loop, and most variables in question are out of scope by then).  Note
that I'm talking about the optimized code that contains no breakpoints
and isn't being single-stepped through.  To implement breakpoints and
single-stepping, the compiler could generate another version of the
code with more potential interrupt points, and this code might be a
bit slower than the normal run-time code.

But real measurements are needed to say anything for sure, and so I'll
add that measurement to the performance chapter of my thesis (which
I'm writing right now, BTW).  I had intended to do the analysis
anyway, but your posting will help make sure I get around to it.

Hopefully the 68k version of the new compiler will be working soon,
and we can see how much of an impact this debugger requirement has on
a relatively register-starved machine.  (Does a '386 have even fewer
registers?  I can configure the code generator to use only as many
physical registers as I tell it to, so I could crudely simulate a
machine with an arbitrarily small number of registers.)

-- Craig
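
The lifetime point in the quoted loop can be made concrete with a
minimal sketch.  It is written in Python rather than the pseudocode
of the quoted article, and it is an illustration only: the names Big,
run and probe, the bodies of g and f, and the loop bound of 3 are all
assumptions made for the example.  CPython's reference counting stands
in for a precise collector here, so the result depends on that
implementation detail.  The sketch records, at each call to f, whether
the y * y temporary is still allocated.

```python
import weakref


class Big:
    """Hypothetical stand-in for a heap-allocated value (e.g. a bignum)."""

    def __init__(self, v):
        self.v = v

    def __mul__(self, other):
        return Big(self.v * other.v)

    def __add__(self, n):
        return Big(self.v + n)


def g(i):
    # Placeholder for the g(i) of the quoted loop.
    return i


def run(hoist, iterations=3):
    """Return, per iteration, whether the y*y temporary is alive when f runs."""
    y = Big(7)
    alive_at_call = []
    probe = None  # weak reference to the most recent y*y temporary

    def f(z):
        # f stands in for a call that might invoke the collector.
        alive_at_call.append(probe() is not None)

    if hoist:
        # Loop-invariant code motion: y*y is computed once, so the
        # temporary stays reachable across every call to f.
        t = y * y
        probe = weakref.ref(t)
        for i in range(iterations):
            z = t + g(i)
            f(z)
    else:
        for i in range(iterations):
            t = y * y
            probe = weakref.ref(t)
            z = t + g(i)
            del t  # the temporary is dead once z has been computed
            f(z)
    return alive_at_call


if __name__ == "__main__":
    print("unhoisted:", run(hoist=False))
    print("hoisted:  ", run(hoist=True))
```

Under CPython the unhoisted loop should report the temporary as
already reclaimed at every call to f, and the hoisted loop should
report it as still live, which is the extended lifetime discussed
above.  A tracing collector would only show the difference if a
collection actually ran inside f.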