Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.milw.wisc.edu!cs.utexas.edu!uunet!zephyr.ens.tek.com!tekcrl!tekgvs!toma
From: toma@tekgvs.LABS.TEK.COM (Tom Almy)
Newsgroups: comp.lang.forth
Subject: Re: Forth Compilation (again) ?
Message-ID: <5688@tekgvs.LABS.TEK.COM>
Date: 2 Aug 89 14:41:04 GMT
References: <114600003@uxa.cso.uiuc.edu>
Reply-To: toma@tekgvs.LABS.TEK.COM (Tom Almy)
Organization: Tektronix, Inc., Beaverton, OR.
Lines: 97

In article <114600003@uxa.cso.uiuc.edu> ews00461@uxa.cso.uiuc.edu writes:
>I've seen Forth systems that generate object code. How is usually done ?
>Inline code ? Lots of jsr (subroutine calls) ? Are they simply using
>the dictionary as a "symbol table" ?
>I'm interested in any discussion at all on this, if you don't know how
>it IS done, how would you do it ?

About seven years ago I was wondering why no one ever compiled Forth, so
I set about writing a Forth compiler. The first one I did (using MMS
Forth for the TRS-80) took about 10 screens of code. I later embellished
it into a 120 screen version that is now part of LMI Forth products
(Z-80, 8086, and 80386 based, five different versions).

1. A new keyword COMPILE: is used instead of : in the definition. While
the definition is basically an unchanged colon def, it compiles into the
dictionary like a CODE word. An additional keyword CDOES> is used in
place of DOES>, and generates code like ;CODE. A further keyword
generates PROCs (standard asm subroutines), which can be called
efficiently from other COMPILE: words.

2. About 110 Forth primitives compile directly into machine code. This
includes all of the control structure words and arithmetic functions as
well as most of the memory referencing words. I have an extension to
compile inline 8087 or 80387 code for about a dozen floating point
primitives.

3. Variables, constants and equates (sometimes known as VARs or TO
variables) are compiled directly into machine code.

3A. (I forgot to add...)
any unrecognized words cause an inline change to threaded mode.
Compilation reverts to native code mode when a recognized word is
parsed.

4. Up to two registers are used to hold top of stack values. The
compiler keeps track of how many and where. A loop index can be kept in
a third register. Additional registers are used as temporaries. A
lookahead technique is used to merge primitive operations into single
instructions. For instance, if A, B and C are integers,

    A @ B @ + C !

would compile into:

    MOV AX, WORD PTR A
    ADD AX, WORD PTR B
    MOV WORD PTR C, AX

There is also optimization performed on comparison operations followed
by conditional operations. A @ B @ > IF generates 3 instructions.

5. Performance improvement runs 2-7x, with code size expansion typically
0% to 20%.

I also wrote a batch Forth compiler, CFORTH, for Z-80 and 8086. This
compiler, written in Forth, was based on the Native Code Compiler
discussed above. Major features:

1. Colon definitions could be directed to the target (the program being
compiled) or the host (the compiler itself), allowing the writing of
"macros" which can algorithmically generate data structures. A
parameterless macro facility was also provided.

2. A "compile only to resolve references" keyword allowed libraries. The
parts of Forth-83 supplied in subroutine form, as well as DOS and other
libraries, were supplied in source form and read in with each compile.

3. An assembler was provided.

4. ROMMable generation. The Z-80 version allowed RAM and ROM segments.
The 8086 version allowed separate CODE, DATA and STACK segments. For
MS/DOS use, TINY model (.COM file) and SMALL model (.EXE file) are
supported, and the stack segment can be separate in either case.

5. A multitasker allowed multiple execution threads (separate stack
segments but shared code and data segments). A facility allowed
independent task variables (USER VARIABLES in Forthspeak).

6. Compilation time was on par with Borland TURBO languages for similar
programs.
Execution speed was slightly better than Turbo-C or MSC, and code size
much smaller than any other compiler I've seen. Typically smaller than
Forth! (No headers, remember!)

CFORTH is still available, but has never really sold well. I use it all
the time because it simply is the fastest, most compact compiler around
(btw, the exe file size is about 32k!). I feel that it died because:

1. Lack of promotion -- who has heard about it?

2. Forth programmers have a distaste for batch compilers.

3. Other language users (who like(?) batch compilers) have a distaste
for Forth.

Tom Almy
toma@tekgvs.labs.tek.com
Yes, I have a commercial interest in this, but the interest does not
extend to my employer.
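
For anyone who wants to experiment with the idea in point 4 of the Native
Code Compiler list (cache the top of stack in a register and use lookahead
to merge primitives into single instructions), here is a minimal sketch.
It is not the LMI compiler's code and is written in Python purely for
brevity. The function name compile_forth, the tags "addr", "mem" and "reg",
and the restriction to one cache register (AX) are assumptions made for
illustration. It understands only variables, @, + and !, and it gets the
lookahead effect by deferring code generation rather than by scanning
ahead in the input.

```python
# Toy sketch (hypothetical, not the LMI implementation): model the data stack
# at compile time so that several Forth primitives fold into one instruction.
# Only variables, @, + and ! are handled, and only AX caches the top of stack.

def compile_forth(source, variables):
    """Return a list of 8086-style instruction strings for `source`."""
    stack = []   # compile-time model: where each stack item currently lives
    code = []    # emitted instructions

    def load(item):
        # Turn a deferred memory fetch into a value held in AX.
        kind, name = item
        if kind != "mem":
            return item
        if ("reg", "AX") in stack:
            # A real compiler would use a second register or spill here.
            raise NotImplementedError("AX already holds a live value")
        code.append(f"MOV AX, WORD PTR {name}")
        return ("reg", "AX")

    for word in source.split():
        if word in variables:
            stack.append(("addr", word))      # address known at compile time
        elif word == "@":
            kind, name = stack.pop()
            if kind != "addr":
                raise NotImplementedError("computed addresses are not modeled")
            stack.append(("mem", name))       # fetch is deferred, no code yet
        elif word == "+":
            right = stack.pop()
            left = stack.pop()
            if right[0] == "reg":             # + commutes: keep AX on the left
                left, right = right, left
            left = load(left)
            if left[0] != "reg" or right[0] != "mem":
                raise NotImplementedError("operand combination not modeled")
            code.append(f"ADD AX, WORD PTR {right[1]}")
            stack.append(("reg", "AX"))
        elif word == "!":
            target = stack.pop()
            value = load(stack.pop())
            if target[0] != "addr" or value[0] != "reg":
                raise NotImplementedError("operand combination not modeled")
            code.append(f"MOV WORD PTR {target[1]}, AX")
        else:
            raise NotImplementedError(f"unknown word: {word}")
    return code


if __name__ == "__main__":
    for line in compile_forth("A @ B @ + C !", {"A", "B", "C"}):
        print(line)
```

Run on A @ B @ + C ! with A, B and C declared as variables, the sketch
emits the same three instructions shown in point 4. Anything that would
need a second register raises NotImplementedError instead of producing
wrong code.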