Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!stanford.edu!snorkelwacker.mit.edu!bloom-beacon!dont-send-mail-to-path-lines From: gyro@cymbal.reasoning.COM (Scott Layson Burson) Newsgroups: comp.lang.scheme Subject: stack/heap continuation allocation Message-ID: <9105020729.AA03056@cymbal.reasoning.com.> Date: 2 May 91 07:29:59 GMT Sender: daemon@athena.mit.edu (Mr Background) Reply-To: Gyro@reasoning.com Organization: The Internet Lines: 81 I've had an interesting idea come together over the last few days. I was thinking about implementations of firstclass continuations. The ones I'm familiar with work basically as follows: to reify a continuation, they allocate a chunk of heap and copy the stack into it; to invoke it, they copy it back into the stack. I was reflecting that that seemed like a lot of overhead, and noting that if one could heap-allocate activation frames, continuation reification and invocation per se would be extremely cheap, but of course heap- allocating frames is extremely expensive in terms of GC, and wouldn't it be nice to have the best of both worlds? The efficiency of a stack for the normal case, that is, with the efficiency of a heap for continuations. And while I haven't worked out all the details, I think I've come up with a way to do that. The fundamental observation is quite simple. Roughly stated (I'm aware that I'm skipping over details and variations in the way this is done): in a normal stack-oriented language implementation, there are two registers, the stack pointer (SP) and the frame pointer (FP). Allocation of an activation frame usually looks something like push FP move SP into FP and deallocation looks like pop FP move FP into SP Thus FP "tracks" SP in the sense that the difference SP - FP is the number of words that have been pushed since the last frame allocation. [For expository purposes I take the point of view that the stack grows "up", toward "higher" addresses, though of course inverted stacks are common.] It occurred to me that this tracking is not strictly necessary; it is possible to use FP for references to the current activation frame and SP for allocation of the next activation frame, and have the two be potentially independent. Specifically: we can denote a register to be the "stack bottom" register SB, such that we allow FP < SB but require always that SP >= SB. Then, basically, what CALL/CC does to reify the current continuation is to set SB to SP, conceptually transferring the frames that had been between SB and SP from the stack into the heap (but this transfer is conceptual only, and requires no copying). Then FP tracks SP as usual *except* when execution is inside a captured continuation. The only overhead this adds to normal operation is to increase the amount of work required to deallocate a frame: load FP[offset] into FP ; known at compile time compare SP with SB + 1 ; `+ 1' covers initial `push FP' branch if equal to L move FP into SP L: Simple, yes? This probably isn't the right thing for creating continuations in a preemptive multithreading system. The problem is that on every thread switch, all the frames "recently" pushed on the stack get transferred into the heap. A multithreading system can and should take advantage of the fact that each continuation it creates will be invoked at most once, by switching entire stacks. But for a program that really does make use of firstclass continuations in a big way, this is as efficient an approach as I can imagine. Of course the garbage collector has to be written to handle the structures that result from doing this. (Another caveat: I am not sure how this interacts with some of the cleverer ways of using the stack that Orbit (the T compiler), for instance, is capable of.) If this pans out -- and I haven't found anything wrong with it yet -- it definitely qualifies as my best idea in months. I don't have time to turn it into a paper -- if anyone out there would like to do so, with me as co-author, I would be delighted! -- Scott