Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!uakari.primate.wisc.edu!dali.cs.montana.edu!milton!uw-beaver!zephyr.ens.tek.com!tekchips!tekgvs!toma
From: toma@tekgvs.LABS.TEK.COM (Tom Almy)
Newsgroups: comp.lang.lisp.x
Subject: Re: xlisp 2.1/winterp internals (26K long)
Message-ID: <8440@tekgvs.LABS.TEK.COM>
Date: 16 Nov 90 21:13:29 GMT
References: <JSP.90Nov15145624@glia.u.washington.edu>
Reply-To: toma@tekgvs.LABS.TEK.COM (Tom Almy)
Distribution: comp
Organization: Tektronix, Inc., Beaverton,  OR.
Lines: 206

>I've just finished reading the xlisp 2.1 source code for the first
>time.  The tutorial and reference material included with the winterp
>distribution are well done, but I would have liked an overview of the
>interpreter internals.  Here's a first cut at such a document.
>Comments welcome...

I have spent many hours going over the listings, fixing bugs, and making
extensions. I wish I had this when I started. But I do have a few comments.


>xlenv and xlfenv are conceptually a single environment, although they
>are implemented separately. [...]

>The xlfenv environment is maintained strictly parallel to xlenv, but
>is used to find function values instead of variable values.  The
>separation may be partly for lookup speed and partly for historical
>reasons.

They have to be maintained separately because LET lexically binds only
variable values, while FLET, LABELS, and MACROLET lexically bind only
functions. For instance, consider:
(defun x () x)
(setq x 10)
(let ((x 3)) (print x) (print (x)))

will print 3 and 10.

while

(flet ((x () (+ 1 x))) (print x) (print (x)))

will print 10 and 11.

and 

(let ((x 3)) (flet ((x () (+ 1 x))) (print x) (print (x))))

will print 3 and 4.

You couldn't do this with a combined binding list.
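
A rough C sketch of the point, not taken from the xlisp sources. It assumes
each environment is a plain linked list of name/value pairs searched from the
innermost scope outward. The names (binding, lookup, variable_x_separate,
variable_x_combined) are made up for illustration, ints stand in for Lisp
values, and the "combined" case is a naive list with untagged entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One binding; 'next' points outward to the enclosing scope. */
struct binding {
    const char *name;
    int value;                    /* an int stands in for any Lisp value */
    const struct binding *next;
};

enum { GLOBAL_FN = 100, LOCAL_FN = 101 };  /* stand-ins for function objects */

/* Search a list innermost-first; fall back to the global value. */
static int lookup(const struct binding *env, const char *name, int global)
{
    for (; env != NULL; env = env->next)
        if (strcmp(env->name, name) == 0)
            return env->value;
    return global;
}

/* Variable x inside (let ((x 3)) (flet ((x ...)) ...)), separate lists. */
int variable_x_separate(void)
{
    struct binding var_env = { "x", 3, NULL };         /* pushed by LET  */
    struct binding fun_env = { "x", LOCAL_FN, NULL };  /* pushed by FLET */

    assert(lookup(&fun_env, "x", GLOBAL_FN) == LOCAL_FN);
    return lookup(&var_env, "x", 10);                  /* LET's 3 is visible */
}

/* Same scopes on one untagged list: FLET's entry hides LET's. */
int variable_x_combined(void)
{
    struct binding let_x  = { "x", 3, NULL };
    struct binding flet_x = { "x", LOCAL_FN, &let_x };

    return lookup(&flet_x, "x", 10);                   /* finds the function */
}
```

With separate lists the variable lookup still sees 3, while on the single
untagged list the innermost entry for x is the function, so the variable
reference gets the wrong thing.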


>The xldenv environment tracks the old values of global variables which
>we have changed but intend to restore later to their original values,
>particularly when we bind and unbind s_evalhook and s_applyhook
>(*EVALHOOK* and *APPLYHOOK*).  (This is mostly to support the debug
>facilities.)  It is a simple list of sym-val pairs,
>treated as a stack.

xldenv tracks dynamic bindings (as opposed to lexical bindings). A "flaw"
in xlisp is that there is no mechanism for declaring special variables
(which would always be dynamically bound). You can dynamically bind
variables with PROGV. If my memory serves, only PROGV, EVALHOOK, and
(as I implemented it) APPLYHOOK dynamically bind variables.  For instance,
consider the following variation of the LET example above:

(defun x () x)
(setq x 10)
(progv '(x) '(3) (print x) (print (x)))

will print 3 and 3. (When execution falls out of progv, the global x is
rebound to 10).


PROGV is the best way to override global variable settings in an application,
since the variables are restored automatically when the PROGV form exits.
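
The save-and-restore idea behind dynamic binding can be sketched in C as
follows. This is only an illustration under simplifying assumptions: a single
int global stands in for the symbol x, the stack is a fixed array, and every
name (dyn_bind, dyn_unbind_to, call_x, progv_demo) is invented rather than
taken from xlisp.

```c
#include <assert.h>

#define MAX_SAVED 16

int global_x = 10;                  /* the global value of x */

/* Saved (variable, old value) pairs, used as a stack. */
static struct { int *var; int old; } saved[MAX_SAVED];
static int nsaved = 0;

/* Dynamically bind *var to value, remembering the old value. */
static void dyn_bind(int *var, int value)
{
    assert(nsaved < MAX_SAVED);
    saved[nsaved].var = var;
    saved[nsaved].old = *var;
    nsaved++;
    *var = value;
}

/* Undo bindings until only 'mark' entries remain on the stack. */
static void dyn_unbind_to(int mark)
{
    while (nsaved > mark) {
        nsaved--;
        *saved[nsaved].var = saved[nsaved].old;
    }
}

/* Models (defun x () x): reads whatever the global holds right now. */
int call_x(void)
{
    return global_x;
}

/* Models (progv '(x) '(3) (x)); returns what (x) saw inside the form. */
int progv_demo(void)
{
    int mark = nsaved;
    int seen;

    dyn_bind(&global_x, 3);
    seen = call_x();                /* the callee sees the new value */
    dyn_unbind_to(mark);            /* old value comes back on exit  */
    return seen;
}
```

Because the binding changes the global itself rather than a lexical list, the
called function sees 3 inside the form, and the global is 10 again afterwards.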


>Obviously, several of the above types won't fit in a fixed-size
>two-slot node.  The escape is to have them malloc() some memory
>and have one of the slots point to it -- VECTOR is the archetype.  For
>example, see xldmem.c:newvector().  To some extent, this malloc()
>hack simply exports the memory-fragmentation problem to the C
>malloc()/free() routines.  However, it helps keep xlisp simple, and it
>has the happy side-effect of unpinning the body of the vector, so that
>vectors can easily be expanded and contracted.

XSCHEME, which relies more heavily on arrays, maintains a pool of storage
from which it allocates arrays and strings, and on which it performs garbage
collection and (I believe) compaction as well. At any rate, my modified xlisp
can optionally use the XSCHEME approach, which has decided advantages in
programs that use many arrays and strings, since the memory does not
get fragmented. Enough said.


>Xlisp pre-allocates nodes for all ascii characters, and for small
>integers.  These nodes are never garbage-collected.

This also speeds up READ, and vastly reduces the number of nodes, since
each character and each small integer is represented by a single shared node.
The range of small integers treated in this way is settable at compile time.
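
A minimal C sketch of the small-integer table, assuming a compile-time range
given by two macros. The macro names (SMALL_MIN, SMALL_MAX), their default
values, the node layout, and make_fixnum are all hypothetical and chosen only
for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Range of shared integers, overridable on the compiler command line. */
#ifndef SMALL_MIN
#define SMALL_MIN (-128)
#endif
#ifndef SMALL_MAX
#define SMALL_MAX 255
#endif

struct node { long value; };

static struct node small_ints[SMALL_MAX - SMALL_MIN + 1];
static int small_ready = 0;

/* Return a node holding n: the shared one if n is small, else a new one. */
struct node *make_fixnum(long n)
{
    struct node *p;

    if (!small_ready) {                     /* fill the table once */
        long i;
        for (i = SMALL_MIN; i <= SMALL_MAX; i++)
            small_ints[i - SMALL_MIN].value = i;
        small_ready = 1;
    }
    if (n >= SMALL_MIN && n <= SMALL_MAX)
        return &small_ints[n - SMALL_MIN];  /* no allocation at all */

    p = malloc(sizeof *p);                  /* stand-in for a heap node */
    assert(p != NULL);
    p->value = n;
    return p;
}
```

Two requests for the same small integer return the same node, so nothing is
allocated and nothing needs collecting; larger values get a fresh node each
time.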


>As a practical matter, allocating all nodes in a single array is not
>very sensible.  Instead, nodes are allocated as needed, in segments of
>one or two thousand nodes, and the segments linked by a pointer chain
>rooted at xldmem.c:segs.

The size of the segment is settable using the ALLOC function.
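
Segment-at-a-time allocation with an adjustable segment size might look
roughly like this in C. It is a sketch only: there is no garbage collector,
the node holds nothing but a free-list link, and set_segment_size, new_node,
and segment_count are invented names. Only the name segs is borrowed from the
quoted text.

```c
#include <assert.h>
#include <stdlib.h>

struct node { struct node *next_free; };

struct segment {
    struct segment *next;     /* chain of all segments */
    struct node nodes[];      /* seg_size nodes follow the header */
};

static struct segment *segs = NULL;    /* root of the segment chain */
static struct node *free_list = NULL;  /* nodes not yet handed out  */
static int seg_size = 1000;            /* nodes per new segment     */

/* Change the size used for future segments; returns the old size. */
int set_segment_size(int n)
{
    int old = seg_size;

    assert(n > 0);
    seg_size = n;
    return old;
}

/* Allocate one segment and thread its nodes onto the free list. */
static void add_segment(void)
{
    struct segment *s =
        malloc(sizeof *s + (size_t)seg_size * sizeof(struct node));
    int i;

    assert(s != NULL);
    s->next = segs;
    segs = s;
    for (i = 0; i < seg_size; i++) {
        s->nodes[i].next_free = free_list;
        free_list = &s->nodes[i];
    }
}

/* Hand out a node, growing the heap by one segment when none are free. */
struct node *new_node(void)
{
    struct node *n;

    if (free_list == NULL)
        add_segment();
    n = free_list;
    free_list = n->next_free;
    return n;
}

/* Count segments by walking the chain from segs. */
int segment_count(void)
{
    int count = 0;
    const struct segment *s;

    for (s = segs; s != NULL; s = s->next)
        count++;
    return count;
}
```

With the size set to 4, the fifth request for a node forces a second segment;
existing segments keep the size they were created with.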

>You create a symbol in xlisp by using the
>single-quote operator: "'name", or by calling "(gensym)", or
>indirectly in various ways.

I would say that 'name is an indirect way to create a symbol. The direct
ways are using MAKE-SYMBOL (for uninterned symbols) or INTERN (for interned
symbols), or as you mentioned GENSYM (also uninterned). You can make READ
create an uninterned symbol by preceding it with #:; otherwise all symbols
read by READ are interned.

In addition, when you make a symbol that starts with the colon character,
the symbol is given itself as its value; otherwise the new symbol has no
value.


>OBJECT is the root of the class hierarchy: everything you can send a
>message to is of type OBJECT.  (Vectors, chars, integers and so forth
>stand outside the object hierarchy -- you can't send messages to them.
>I'm not sure why Dave did it this way.)

Probably because the object facility is an extension of lisp. You can
create classes of these things. There are also efficiency considerations.
The only object oriented programming language I know of where everything
is an object is Smalltalk, but if you look at the implementation, it does
cheat at the low level to speed things up.

> :isnew -- Does nothing

It does return the object!


>FSUBR: A special primitive fn coded in C, which (like IF) wants its
>arguments unevaluated.  

These are the "special forms".

>We scan the MESSAGES list in the CLASS object of the recipient,
>looking for a (message-symbol method) pair that matches our message
>symbol.  If necessary, we scan the MESSAGES lists of the recipient's
>superclasses too.  (xlobj.c:sendmsg().)  Once we find it, we basically
>do a normal function evaluation. (xlobj.c:evmethod().)  Two oddities:
>We need to replace the message-symbol by the recipient on the argument
>stack to make things look normal, and we need to push an 'object'
>stack entry on the xlenv environment so we remember which class is
>handling the message.


The first "oddity" has an important side effect: when :answer was
used to build the method closure, an additional argument, "self", was
added so that the method could refer to the receiving object through the
symbol self. This argument stack fix supplies the needed argument.

The reason for the second "oddity" is that the method's class is
needed for SEND-SUPER. When one uses SEND-SUPER, the message lookup
begins in the superclass of the method rather than the class of the
object (as with SEND).
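
The difference in where the lookup starts can be sketched in C. This is an
illustration under assumptions, not the xlobj.c code: classes are structs with
a superclass pointer and a NULL-terminated message table, a method is just an
integer id, and all names (klass, find_method, send_lookup, send_super_lookup,
and the three demo classes) are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct method { const char *selector; int id; };

struct klass {
    const struct klass *super;
    const struct method *messages;   /* ends with a NULL selector */
};

/* Search cls and then its superclasses; returns the method id, or -1. */
int find_method(const struct klass *cls, const char *sel)
{
    for (; cls != NULL; cls = cls->super) {
        const struct method *m;
        for (m = cls->messages; m->selector != NULL; m++)
            if (strcmp(m->selector, sel) == 0)
                return m->id;
    }
    return -1;
}

/* SEND: the lookup starts at the receiver's own class. */
int send_lookup(const struct klass *receiver_class, const char *sel)
{
    return find_method(receiver_class, sel);
}

/* SEND-SUPER: starts above the class that owns the running method. */
int send_super_lookup(const struct klass *method_class, const char *sel)
{
    assert(method_class != NULL);
    return find_method(method_class->super, sel);
}

/* Demo: base answers :show (1), derived overrides it (2), leaf adds nothing. */
static const struct method base_msgs[]    = { { ":show", 1 }, { NULL, 0 } };
static const struct method derived_msgs[] = { { ":show", 2 }, { NULL, 0 } };
static const struct method leaf_msgs[]    = { { NULL, 0 } };

const struct klass base_class    = { NULL, base_msgs };
const struct klass derived_class = { &base_class, derived_msgs };
const struct klass leaf_class    = { &derived_class, leaf_msgs };
```

For an object of the leaf class, SEND :show finds method 2 in derived. If
method 2 then did SEND-SUPER starting above the object's class (leaf), it would
find itself again; starting above the method's class (derived) reaches method
1 in base, which is why the handling class has to be remembered.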

>    xlstkcheck(3);    /* Make sure following xlsave */
>                      /* calls won't overrun stack. */
>    xlsave(list_ptr); /* Use xlsave1() if you don't */
>    xlsave(float_ptr);/* do an xlstkcheck().        */
>    xlsave(int_ptr);

xlsave also sets the variable to NIL. If you don't need that, you
can use xlprot instead of xlsave, or xlprot1 instead of xlsave1.
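
The distinction can be shown with a simplified stand-in for the protection
stack. This is a sketch, not the xlisp macros: demo_prot and demo_save are
invented names, the stack is a small fixed array, LVAL is assumed to be a plain
node pointer with NIL as a null pointer, and the overflow check is folded into
every push for simplicity.

```c
#include <assert.h>
#include <stddef.h>

struct node { int unused; };
typedef struct node *LVAL;           /* pointer to a Lisp node (assumed) */
#define NIL ((LVAL)NULL)

#define STACK_SIZE 8
static LVAL *stack_area[STACK_SIZE];                /* addresses of C variables */
static LVAL **stack_top = stack_area + STACK_SIZE;  /* grows downward */

/* Register a variable with the collector, leaving its value alone. */
#define demo_prot(v) \
    do { assert(stack_top > stack_area); *--stack_top = &(v); } while (0)

/* Register a variable and also reset it to NIL. */
#define demo_save(v) \
    do { demo_prot(v); (v) = NIL; } while (0)

/* Returns 1 if a value already in the variable survives protection. */
int value_survives_prot(void)
{
    struct node n = { 0 };
    LVAL v = &n;
    LVAL **mark = stack_top;

    demo_prot(v);
    assert(*stack_top == &v);        /* the variable's address was pushed */
    stack_top = mark;                /* pop before returning */
    return v == &n;
}

/* Returns 1 if saving the variable resets it to NIL. */
int value_cleared_by_save(void)
{
    struct node n = { 0 };
    LVAL v = &n;
    LVAL **mark = stack_top;

    demo_save(v);
    stack_top = mark;                /* pop before returning */
    return v == NIL;
}
```

The non-clearing form is the one to reach for when the variable already holds
a live value, such as an argument fetched before the protection call.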

>xlapply, xlevform and sendmsg will issue an error if they encounter a
>s_macro CLOSURE.  This is presumably because all macros are expanded
>by xleval.c:xlclose when it builds a closure.

You are not allowed to use APPLY or FUNCALL with macros in Common
Lisp. There is no way provided to declare macro methods, nor do they
make much sense (at least in my mind).

>Neither xlapply nor sendmsg will handle FSUBRs.  This is presumably
>a minor bug, left due to the difficulty of keeping arguments
>unevaluated to that point. ?

You are not allowed to use APPLY or FUNCALL with special forms. There is
no way to declare methods using SUBRs or FSUBRs (the existing SUBR
methods are initialized at load time).

>
> Minor Mysteries:
> ----------------

>Why doesn't xlevform trace FSUBRs?  Is this a speed hack?
Good question. Probably not a speed hack. You can't trace macros either.

>Why do both xlobj.c:xloinit() and xlobj.c:obsymbols() initialize the
>"object" and "class" variables?

xloinit creates the classes class and object, as well as the symbols, and
sets the C variables class and object to point to the new classes.

obsymbols just sets the C variables by looking up the symbols. It is needed
because when you restore a workspace you don't create new objects but still
need to know where the existing objects are (they might be in a different
location in the saved workspace). Notice that obsymbols is called by xlsymbols,
which is called both when initializing a new workspace and when restoring an
old one.


Tom Almy
toma@tekgvs.labs.tek.com
Standard Disclaimers Apply