Path: utzoo!attcan!uunet!mcsun!hp4nl!swi.psy.uva.nl!jan
From: jan@swi.psy.uva.nl (Jan Wielemaker)
Newsgroups: comp.lang.prolog
Subject: Re: GC triggering and stack limit checking by MMU hardware
Keywords: GC, stack, heap, MMU
Message-ID: <3261@swi.swi.psy.uva.nl>
Date: 23 Jul 90 20:12:34 GMT
References: <1990Jul19.151524.22544@diku.dk> <3260@swi.swi.psy.uva.nl> <11079@alice.UUCP>
Reply-To: jan@swi.psy.uva.nl (Jan Wielemaker)
Organization: SWI, UvA, Amsterdam
Lines: 57

Computers and operating systems are (or should be) designed to make life
easier for the user and the programmer.  Fernando Pereira claims that the
overhead of stack shifting and stack overflow checks is not very large,
that this scheme is much more portable and that it is less sensitive
to OS errors.  All these arguments are correct.  There is, however, another
reason why I don't like them.  I do not open /dev/rxy0c to read and write
files because I do not trust the OS disk cache and file system handling.
Instead, I tend to use open(), read(), write() and close().  This problem
is a bit similar.  Using the MMU to handle the stacks, I get the
following advantages:

	- I do not have to think where to do stack overflow checks or
	  how to do them cleverly.  [Question: how do you know how
	  much will be written on the trail before the next point
	  where you can safely do a stack shift?]
	- I do not have to write a stack shifter, nor think about when
	  to call this thing and how much to grow the stacks.  For any
	  such parameters there always are programs for which they have
	  the wrong value.
	- If avoiding stack shifts were all I wanted, I would not have
	  to keep track of references to the stacks (from the virtual
	  machine, foreign language code, etc.) as the stacks do not
	  move.  [Unfortunately I need to keep track of all these
	  things for the garbage collector anyway.]
	- With stack shifting, if I used malloc()/free() or a similar
	  package for handling the other data (notably the program), I
	  would need to write my own versions of them, because the
	  stacks need to be kept together to avoid large unused chunks
	  of memory.  [Unluckily a dedicated perfect-fit algorithm
	  performs much better than the malloc()/free() of most OSes,
	  so I wrote a dedicated memory allocation system anyway, but
	  I think the argument nevertheless holds.  Also, my dedicated
	  algorithm does nicely on Prolog, but might not suit foreign
	  language code very well.  That code can still use
	  malloc()/free() without conflicting with Prolog.]
	- Still, it IS faster (provided it works).
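
For comparison, the explicit scheme that the points above argue against
might look roughly like the sketch below.  It is only an illustration:
the names (shift_stack, shift_stack_push, the refs array) are
hypothetical, the doubling policy is an arbitrary guess, and a real
Prolog system would have many more references to relocate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable stack of cells, managed without MMU help. */
typedef struct {
    long *base;   /* first cell */
    long *top;    /* next free cell */
    long *limit;  /* one past the last allocated cell */
} shift_stack;

static int shift_stack_init(shift_stack *s, size_t cap)
{
    assert(cap > 0);
    s->base = malloc(cap * sizeof *s->base);
    if (s->base == NULL)
        return -1;
    s->top = s->base;
    s->limit = s->base + cap;
    return 0;
}

/* The "stack shifter": move the stack to a larger block and relocate
 * every outside pointer into it.  refs[i] is the address of a pointer
 * variable that currently points into the stack. */
static int shift_stack_grow(shift_stack *s, long **refs[], size_t nrefs)
{
    size_t used = (size_t)(s->top - s->base);
    size_t ncap = 2 * (size_t)(s->limit - s->base); /* growth policy: a guess */
    long *nb = malloc(ncap * sizeof *nb);
    if (nb == NULL)
        return -1;
    memcpy(nb, s->base, used * sizeof *nb);
    for (size_t i = 0; i < nrefs; i++)   /* forget one and it dangles */
        *refs[i] = nb + (*refs[i] - s->base);
    free(s->base);
    s->base = nb;
    s->top = nb + used;
    s->limit = nb + ncap;
    return 0;
}

/* Every push pays for an explicit overflow check. */
static int shift_stack_push(shift_stack *s, long v,
                            long **refs[], size_t nrefs)
{
    if (s->top == s->limit && shift_stack_grow(s, refs, nrefs) != 0)
        return -1;
    *s->top++ = v;
    return 0;
}
```

Both the check in shift_stack_push() and the relocation loop in
shift_stack_grow() are work that disappears when the stack never moves.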

Having access to the MMU and the possibility to return from the SEGV
handler saves me all the research and programming effort for this.  On
SunOS (from version 4.0.1) this works nicely for SUN-3 and SPARC.  It also
works on the GOULD PN9000 under UTX 2.1.  Probably there are more systems
that offer this.  Besides all this I even gain something concrete on
these systems: I really can give memory resources back to the system, so
after a query that used 4 MB of stack I will not claim these 4 MB for
the rest of the session.  I use this after calling the garbage collector
and after finishing a user query (there is a Prolog predicate for
deallocating everything above the used part of the stack).  I am
considering using it after deep backtracking as well, if the used part
of the stack is far below the allocated part.
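
To make the mechanism concrete, here is a minimal sketch of the same idea
on a present-day Linux system.  It is not the SunOS or UTX code discussed
above, and all names (sparse_stack, sparse_stack_init, sparse_stack_trim)
are hypothetical.  It assumes a single stack, and it relies on two things
POSIX does not formally promise: that returning from a SIGSEGV handler
restarts the faulting instruction, and that mprotect() may be called from
a signal handler.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    char           *base;      /* start of the reserved address range */
    size_t          reserved;  /* bytes of address space reserved */
    volatile size_t committed; /* accessible bytes; updated by the handler */
} sparse_stack;

static sparse_stack *active_stack;  /* the one stack the handler knows */
static size_t page_size;

/* On a fault inside the reserved range, make everything up to and
 * including the faulting page accessible, then return to retry. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    sparse_stack *s = active_stack;
    char *addr = (char *)info->si_addr;

    if (s == NULL || addr < s->base || addr >= s->base + s->reserved)
        abort();                     /* a genuine crash or real overflow */
    size_t need = ((size_t)(addr - s->base) / page_size + 1) * page_size;
    if (need <= s->committed)
        abort();                     /* fault in mapped memory: not ours */
    if (mprotect(s->base, need, PROT_READ | PROT_WRITE) != 0)
        abort();
    s->committed = need;
}

/* Reserve address space without making any of it accessible yet. */
static int sparse_stack_init(sparse_stack *s, size_t reserve_bytes)
{
    struct sigaction sa;
    void *p;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, reserve_bytes, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    s->base = p;
    s->reserved = reserve_bytes;
    s->committed = 0;
    active_stack = s;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Give everything above the first `used` bytes back to the system. */
static int sparse_stack_trim(sparse_stack *s, size_t used)
{
    size_t keep = (used + page_size - 1) / page_size * page_size;

    if (keep >= s->committed)
        return 0;
    /* Mapping fresh inaccessible pages over the range drops the old ones. */
    if (mmap(s->base + keep, s->committed - keep, PROT_NONE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        return -1;
    s->committed = keep;
    return 0;
}
```

A caller would reserve a generous range once with sparse_stack_init(),
write through base with no checks at all, and call sparse_stack_trim()
after garbage collection or at the end of a query.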

To me it is clear that access to the MMU simplifies the task of the
implementor, notably of multi-stack languages.  The only real question
is: can an OS/hardware *in principle* do sparse addressing as efficiently
as handling one contiguous address space?  If the answer to this
question is negative there is a tradeoff.  If the answer is positive any
decent OS should provide these facilities.

	Jan