Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!mhuxn!mhuxr!mhuxt!houxm!whuxl!whuxlm!akgua!usl!elg
From: elg@usl.UUCP (Eric Lee Green)
Newsgroups: net.lang
Subject: Re: What's so good about FORTH?
Message-ID: <777@usl.UUCP>
Date: Wed, 18-Jun-86 16:09:33 EDT
Article-I.D.: usl.777
Posted: Wed Jun 18 16:09:33 1986
Date-Received: Fri, 20-Jun-86 05:32:38 EDT
References: <201@pyuxv.UUCP> <3700003@uiucdcsp> <132@vaxb.calgary.UUCP>
Reply-To: elg@usl.UUCP (Eric Lee Green)
Distribution: na
Organization: USL, Lafayette, LA
Lines: 59
Keywords: FORTH, threaded-code

In article <634@ucbcad.BERKELEY.EDU> keppel@pavepaws.UUCP (David Keppel) writes:
>>* small size : threaded-code is about as small as you can get.
>
>Pardon, but I've never quite understood what threaded-code is.
>Could somebody give me an explanation of
>	o what it is
>	o why it is fast
>	o why other major languages don't use it or don't admit to using it

I'm not really a FORTH expert, just a sometime user who's written a
couple of little things in FORTH, but:

A subroutine (one FORTH word) is either a list of subroutine
addresses, each address corresponding to one FORTH word, or a piece of
machine language (ML). To execute a subroutine, there are two cases:

If it's an assembly language (ML) subroutine, directly execute it.
Only a few routines at the very bottom of the pile are written in ML,
mostly things like "+", "-", etc.

If it's a list of subroutine addresses, execute the individual
subroutines in that list, in the same way as above (i.e., recurse).

Needless to say, there are several ways to thread subroutines. A really
fast implementation is to just make the list of subroutine addresses an
ML list of subroutine calls. However, on a small computer like a 6502,
a jsr is 3 bytes against 2 for a bare address, so that takes 3/2 the
space of just storing the addresses. On a VAX I wouldn't care, on a
C-64, well... The method commonly used is to have an inner interpreter
which fetches an address from the list of addresses, looks at the
header at that address to determine if it's an ML routine or an
address list, and if it's an ML routine does a jsr to it, else stacks
the pseudo-interpreter's current "program counter" and starts over at
the beginning of the called list.
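
As a rough illustration of that inner interpreter, here is a minimal
sketch in Python (not FORTH, and not any particular implementation).
The names Word, run, SQUARE and FOURTH are made up for the example, a
Python object reference stands in for an address, and running off the
end of a list is treated as the "return".

```python
ML, LIST = "ML", "LIST"

class Word:
    """One word: a header (kind) plus a body."""
    def __init__(self, kind, body):
        self.kind = kind   # the "header": ML routine or address list
        self.body = body   # a Python callable, or a list of Word references

def run(word, data_stack):
    """Inner interpreter: execute one word against data_stack."""
    if word.kind == ML:
        word.body(data_stack)            # ML routine: execute it directly
        return
    return_stack = []                    # saved (list, position) pairs
    code, ip = word.body, 0              # ip is the pseudo "program counter"
    while True:
        if ip == len(code):              # end of list acts as a return
            if not return_stack:
                return
            code, ip = return_stack.pop()
            continue
        w = code[ip]                     # fetch the next address
        ip += 1
        if w.kind == ML:
            w.body(data_stack)           # the "jsr" to an ML routine
        else:
            return_stack.append((code, ip))   # stack the program counter
            code, ip = w.body, 0              # start at the top of the new list

# Two ML-level primitives (hypothetical stand-ins for real machine code).
def _dup(s):
    s.append(s[-1])

def _mul(s):
    b = s.pop()
    a = s.pop()
    s.append(a * b)

DUP = Word(ML, _dup)
MUL = Word(ML, _mul)
SQUARE = Word(LIST, [DUP, MUL])          # like  : SQUARE DUP * ;
FOURTH = Word(LIST, [SQUARE, SQUARE])    # a list that calls other lists

stack = [3]
run(FOURTH, stack)
print(stack)   # [81]
```

Note that each entry in SQUARE and FOURTH is a single reference, which
is where the compactness comes from.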

For things like branches and loops, what the called code does is alter
the return address on the interpreter address stack, or just pop the
address stack to the interpreter address counter (= a "return" in
threaded code).
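
A self-contained sketch of how branches and returns might look, again
in Python rather than FORTH. Everything here is an assumption made for
illustration: the VM class, the primitive names (branch, zbranch,
exit_), and the choice to keep branch targets as inline cells in the
address list. A primitive branches by overwriting the interpreter's
address counter, and returns by popping the address stack into it.

```python
class VM:
    def __init__(self):
        self.data = []     # data stack
        self.rstack = []   # interpreter address (return) stack
        self.code = None   # address list currently being interpreted
        self.ip = 0        # interpreter address counter
        self.out = []      # collected output, for the example only

def lit(vm):               # push the inline cell that follows
    vm.data.append(vm.code[vm.ip])
    vm.ip += 1

def dup(vm):
    vm.data.append(vm.data[-1])

def drop(vm):
    vm.data.pop()

def dec(vm):
    vm.data[-1] -= 1

def emit(vm):              # record the top of stack without popping it
    vm.out.append(vm.data[-1])

def branch(vm):            # unconditional: overwrite the address counter
    vm.ip = vm.code[vm.ip]

def zbranch(vm):           # branch only if the popped value is zero
    target = vm.code[vm.ip]
    vm.ip += 1
    if vm.data.pop() == 0:
        vm.ip = target

def exit_(vm):             # "return": pop address stack to the counter
    vm.code, vm.ip = vm.rstack.pop()

def run(vm, word):
    vm.rstack.append((None, 0))          # sentinel that stops the loop
    vm.code, vm.ip = word, 0
    while vm.code is not None:
        w = vm.code[vm.ip]
        vm.ip += 1
        if callable(w):                  # ML-level primitive
            w(vm)
        else:                            # address list: nest into it
            vm.rstack.append((vm.code, vm.ip))
            vm.code, vm.ip = w, 0

# Cell index:  0    1        2  3     4    5       6  7     8
COUNTDOWN = [dup, zbranch, 7, emit, dec, branch, 0, drop, exit_]
MAIN = [lit, 3, COUNTDOWN, exit_]

vm = VM()
run(vm, MAIN)
print(vm.out)   # [3, 2, 1]
```

The loop in COUNTDOWN is nothing more than "branch, 0" sending the
address counter back to cell 0 until zbranch jumps past it to cell 7.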

And that's how that's done. It's probably about 30% slower than
writing it in assembly language, because of all the overhead. On the
other hand, it's extremely compact code, and fits the FORTH language very
well.

Why doesn't any other language use it? Traditionally, FORTH has been
run on machines with extremely small main memory stores, where the
memory savings were more than worth the overhead of going to a threaded
interpreter. Compilers for large machines haven't had to worry about
such things, plus the large machine's architecture is better suited to
compact compiler output. It's interesting to note that almost every
compiler I've seen for 6502 machines produces a P-Code-like code,
which is executed in a similar fashion (the P-Code interpreter
interprets the P-Code as an index into a list of addresses to which it
then does a jmp).
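
For comparison, a small sketch of that P-Code style of dispatch, in
Python. The opcode numbering and handler names are invented for the
example and do not correspond to any real P-Code; a Python list of
functions stands in for the table of addresses, and the call through
the table stands in for the jmp.

```python
PUSH, ADD, MUL, HALT = 0, 1, 2, 3        # hypothetical opcode numbers

# Each handler receives the position just past its opcode and returns
# the position of the next opcode (or None to stop).
def op_push(stack, program, pc):
    stack.append(program[pc])            # one inline operand byte
    return pc + 1

def op_add(stack, program, pc):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)
    return pc

def op_mul(stack, program, pc):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)
    return pc

def op_halt(stack, program, pc):
    return None

# The "list of addresses", indexed by opcode.
HANDLERS = [op_push, op_add, op_mul, op_halt]

def run_pcode(program):
    stack, pc = [], 0
    while pc is not None:
        opcode = program[pc]
        pc = HANDLERS[opcode](stack, program, pc + 1)   # indexed "jmp"
    return stack

program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run_pcode(program))   # [20]
```

Each operation costs one byte here instead of a two-byte address, at
the price of an extra table lookup per operation.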
-- 
Computing from the Bayous,
       Eric Green {akgua,ut-sally}!usl!elg
            (Snail Mail P.O. Box 92191, Lafayette, LA 70509)