Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!willett!ForthNet
From: ForthNet@willett.UUCP (ForthNet articles from GEnie)
Newsgroups: comp.lang.forth
Subject: Forth Implementation
Message-ID: <503.UUL1.3#5129@willett.UUCP>
Date: 18 Feb 90 23:42:11 GMT
Organization: Latest link in the ForthNet chain. (Pgh, PA)
Lines: 97

Category 3,  Topic 24
Message 51        Sat Feb 17, 1990
R.BERKEY [Robert]            at 18:57 PST

To: David Albert
Re: : (colon) Data Structures Threading Tradeoffs

 > ...I have seen that several implementations of Forth use a small
 > "inner interpreter loop" using DS:SI for example as the
 > instruction pointer.  I chose just to use CALL and RET as the
 > entry and exit to my words.  Therefore, CS:IP is my instruction
 > pointer and word pointer.  Here's the question:  Why do people use
 > the separate inner interpreter loop?  It seems that the call and
 > return are much more flexible and that I can more easily manipulate
 > return addresses since they are just on the stack.  I use BP for my
 > parameter stack pointer.

This gets into the whole issue of varieties of implementations of
colon.  To review, the basic varieties of Forth threading techniques
have been called, in increasing order of abstractness:

     native code compilation
     jsr threaded
     direct threaded
     indirect threaded
     token threaded

Native code compilation is just the usual mix of code that an
assembler and an ordinary compiler produce.  This may get called other
things, like direct machine compilation.  A Forth native code
compilation may have lots of calls intermixed with short runs of
low-level code.  Depending on viewpoint, this may or may not be
considered a threading technique.

What you've implemented sounds like it might be related to the class
of JSR (jump subroutine) threading, where the body of a colon
definition contains a sequence of calls.  JSR threading is related to
native code compilation in that the processor looks at them in the
same way.
The structural difference is such that a JSR threaded system can be
compliant with the Forth-83 Standard, while a native code compilation
is not.  A Forth-83 implementation could also have a native code
compiler, but this would be there in addition to the : (colon)
compiler.

The names "direct threaded" (DTC) and "indirect threaded" (ITC) were
criticized on technical grounds in an early Forth Dimensions, but the
names have stuck.

Direct threading gets a code field added to the body of the colon
definition.  The code field is directly executable, although often one
register must be set before executing the code field.

One key answer to your query is that compiling a compilation token on
an 80188 jsr threaded system takes three bytes, whereas compiling a
compilation token with DTC, ITC, TTC, etc., takes two bytes--a
potential for substantial reduction in code size.

Indirect threading means that the code field, instead of being
executable, contains the address of executable code.  The Forth-79
Standard restricted implementations of : to indirect threading.

Token threading (TTC) has several variants.  It may add one more level
of indirection through a table of pointers, to a table of pointers to
code.  With token threading, addresses can be completely isolated from
the main body of code, making relocatability easy.

Specific machine architectures lead to more variations on the above,
including segment threading (SgTC) on the 8086, and a 68000 "token"
threading in which the table is accessed by the architecture and the
thread is directly executable.

It might seem at first glance that these systems would get slower the
more abstract they get.  But then consider that in a JSR system NEXT
is  RET CALL .  That's two bytes of opcode, which reads from memory
four bytes of addresses, and writes two bytes, for a total of eight
bytes of memory access.
Meanwhile an 80188 direct threaded system with an inline NEXT of
LODSW AX JMP  has three bytes of opcode, which reads two bytes of
address, for a total of five bytes of memory access.  Processors
including the PDP-11 and HP2100(?) have single-opcode instructions
that can perform an indirect-threaded NEXT .  It's easy to see that
the speed tradeoffs can get interesting.

As you suggest, there are many other tradeoffs.  I've sometimes
wondered about the efficiency tradeoffs of having the return stack be
the default 80x8x SP stack.  Related to your comment about ease of
manipulating return addresses, one technique that's used is
SP, BP XCHG  to get at the return stack.

One thing I find interesting about JSR is that it clarifies that a
Forth IP register (DS:SI or whatever) is really a part of the return
stack.

Now as for how all this compares with what one discovers when reading
up on TILs, I wouldn't know, but would be interested.

Robert
-----
This message came from GEnie via willett through a semi-automated process.
Report problems to: 'uunet!willett!dwp' or 'willett!dwp@gateway.sei.cmu.edu'
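
Addendum: the byte counts in the NEXT comparison (eight bytes for
RET CALL under JSR threading, five for an inline LODSW AX JMP under
direct threading) and the three-versus-two-byte cost per compiled
token can be tallied mechanically.  The sketch below is only that
arithmetic, written in Python for convenience.  The dictionary and
function names are hypothetical, chosen for illustration, and the
figures are the ones quoted in the post, not measurements.  It counts
bytes of memory traffic only and says nothing about clock cycles,
prefetch behaviour, or any particular Forth system.

```python
# Bytes of memory traffic for one pass through NEXT, per the post.
# All names here are illustrative assumptions, not from any Forth system.
NEXT_COSTS = {
    # JSR threading: NEXT is RET followed by CALL.
    # 2 opcode bytes, 4 address bytes read, 2 bytes written.
    "jsr": {"opcode": 2, "read": 4, "write": 2},
    # Direct threading with inline NEXT: LODSW, then a jump through AX.
    # 3 opcode bytes, 2 address bytes read, nothing written.
    "dtc": {"opcode": 3, "read": 2, "write": 0},
}

# Bytes taken by one compiled token in the body of a colon definition
# on an 80188, again using the post's figures.
BYTES_PER_TOKEN = {"jsr": 3, "dtc": 2}


def next_traffic(scheme):
    """Total bytes of memory access for one NEXT under the given scheme."""
    return sum(NEXT_COSTS[scheme].values())


def thread_bytes(scheme, n_tokens):
    """Bytes occupied by n_tokens compiled tokens under the given scheme."""
    return BYTES_PER_TOKEN[scheme] * n_tokens


if __name__ == "__main__":
    for scheme in NEXT_COSTS:
        print(scheme, next_traffic(scheme), thread_bytes(scheme, 1000))
```

Calling next_traffic("jsr") and next_traffic("dtc") reproduces the two
totals from the post, and thread_bytes shows how the one-byte
difference per token scales with the number of compiled tokens.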