Path: utzoo!attcan!uunet!samsung!sdd.hp.com!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@basagran.csg.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Re: Processor architecture to support functional languages
Message-ID: <AGLEW.90Jun15165103@basagran.csg.uiuc.edu>
Date: 15 Jun 90 20:51:03 GMT
References: <5439@midway.cs.glasgow.ac.uk>
	<CARLTON.90Jun15162416@mingus.mitre.org>
Sender: usenet@ux1.cso.uiuc.edu (News)
Followup-To: comp.arch
Organization: University of Illinois, Computer Systems Group
Lines: 48
In-Reply-To: carlton@mitre.org's message of 15 Jun 90 20:24:16 GMT


>It seems to me that it would be easier and faster (or at least not
>slower) to let somebody else take care of making sure that your stack
>doesn't run out of space unless you actually run out of memory.  It
>would mean that you wouldn't have to write code which has to deal with
>the fact that the stack might move at a moment's notice (though, if
>the garbage collector were written properly, I suppose that there
>would be no way for the program to tell that the stack had moved),
>might reduce fragmentation, and so forth.
>
>...
>
>david carlton
>carlton@linus.mitre.org

Single-threaded programs don't have to worry about stack overflow.
They put the heap at one end of virtual memory, and the stack at the
other, let them grow toward each other, and take a page fault trap when
they grow past the currently allocated bounds.  And you usually run out
of physical memory and swap space before you run out of virtual address
space.

Why can't we use the same technique for multi-threaded stacks?  Apart
from stupidities like OSes that only let you have three sets of
contiguously mapped pages, you can.  The growing toward each other
trick wouldn't help much, but you could conceivably place each thread's
stack in a region of virtual address space larger than the stack will
ever require, with only one page mapped in at the beginning, and then
let it grow.  Page fault when the stack grows past the limit of already
allocated space, and page fault (interpreted differently) if a stack's
top address ever gets close to the base of the next stack in memory.
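
A rough model of that scheme, in Python rather than real VM calls.  This
is only a sketch under assumptions that are not in the post: 4K pages,
stacks that grow downward from the top of their slot, and the lowest
page of each slot kept unmapped as the "too close to the neighbour"
guard.  StackSlot, reserve_slots and handle_fault are made-up names.

```python
from dataclasses import dataclass
from typing import List

PAGE = 4096  # assumed page size in bytes


@dataclass
class StackSlot:
    base: int        # lowest address of the slot; the page here is the guard
    size: int        # reserved bytes, a multiple of PAGE
    mapped_low: int  # lowest address currently backed by memory

    @property
    def top(self) -> int:
        return self.base + self.size  # the stack grows down from here


def reserve_slots(region_base: int, slot_size: int, n_threads: int) -> List[StackSlot]:
    """Carve one reserved slot per thread, with only its top page mapped."""
    slots = []
    for i in range(n_threads):
        base = region_base + i * slot_size
        slots.append(StackSlot(base, slot_size, base + slot_size - PAGE))
    return slots


def handle_fault(slots: List[StackSlot], addr: int) -> str:
    """Classify a faulting address: the two interpretations of a page fault."""
    for s in slots:
        if s.base <= addr < s.top:
            if addr < s.base + PAGE:
                return "overflow"   # reached the guard page next to the neighbour
            if addr < s.mapped_low:
                s.mapped_low = addr - addr % PAGE  # map pages down to the fault
                return "grew"
            return "mapped"         # already backed, so not a real fault
    return "stray"                  # not inside any thread's stack slot
```

For example, with slots = reserve_slots(0x40000000, 16 * PAGE, 4), a
fault three pages below slots[1].top is classified as "grew", while a
fault at slots[1].base + 8 is classified as "overflow".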

Why isn't this done?  Well, let's see - say I have 128 threads.  I'm
on a 32-bit machine, but say that only 1G of addresses are accessible
to the user. And say maybe that 512M of address space is needed for
non-stack uses, like text, bss, the heap, and mapped
files (can't have too many mapped files :-( ).
    So, we have 512M for 128 stacks. 29-7=22 => 4M per stack.  Not
very much, is it, when each stack has to be sized for its worst case?
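
As a sanity check on the division, taking 512M as 2^29 bytes and 128 as
2^7, here is a short Python sketch.  The function name per_stack_bytes
is illustrative only; the figures are the ones given above.

```python
GIB = 1 << 30
MIB = 1 << 20


def per_stack_bytes(user_space: int, non_stack: int, n_threads: int) -> int:
    """Address space left for each stack after the non-stack share is removed."""
    return (user_space - non_stack) // n_threads


share = per_stack_bytes(1 * GIB, 512 * MIB, 128)
print(share // MIB, "MiB per stack")  # prints: 4 MiB per stack
```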

And there you have it.  Existing virtual memory mechanisms are sufficient
hardware support for multiple stacks, except (1) there isn't enough
address space, and (2) large page sizes imply large overhead.  E.g., on a
system with 16K pages, do you really want to allocate a page to each
thread's stack if most of that memory won't be used (but has to be there
for the worst case)?
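
To put a number on point (2), here is a small Python sketch.  The 1K of
stack actually used per thread is an invented figure for illustration;
the 16K page size and the 128 threads come from the text above.

```python
KIB = 1024


def committed_bytes(n_threads: int, page_size: int, used_per_stack: int) -> int:
    """Physical memory tied up when every stack occupies whole pages."""
    pages = max(1, -(-used_per_stack // page_size))  # round up, at least one page
    return n_threads * pages * page_size


total = committed_bytes(128, 16 * KIB, 1 * KIB)   # memory actually allocated
used = 128 * 1 * KIB                              # memory the stacks touch
wasted_fraction = (total - used) / total
```

Under those assumptions total comes to 2M, of which only 128K is in use,
so fifteen sixteenths of the memory allocated to stacks sits idle.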
--
Andy Glew, aglew@uiuc.edu