Path: utzoo!utgpu!watmath!att!rutgers!ucsd!sdcc6!ir230 From: ir230@sdcc6.ucsd.edu (john wavrik) Newsgroups: comp.lang.forth Subject: Forth Implementation (long) Keywords: Forth implementation Message-ID: <5172@sdcc6.ucsd.edu> Date: 17 Nov 89 01:37:23 GMT Organization: University of California, San Diego Lines: 543 Bill Bouma writes: > I fail to see what all this has to do with implementing forth in > assembly language? I see no reason a forth interpreter written in > any other language could not be made to behave identically to one > written in assembly! > Also, I have to disagree with the rest of what was said because the > main thing that distinguishes forth from other languages is the > parameter stack. Other languages (yes, even C) provide means to > build new data types using structures. One can build new operations > by writing functions. A language which does not allow the user to add > to it would not be very useful! I'd like to point out to everyone how this exchange began: As a person who has a lot of experience with Forth, I received from a "proud papa" a version of Forth written in 'C' (I will refer to it as XFORTH). It was lacking some critical words (in particular the double precision wordset) and had others improperly implemented (like UM*), which prevented me from testing it on existing applications. I went to add the missing words and fix the errors (a trivial task on traditional Forth's). When I tried, I found that mechanisms to add the words were lacking too. This inspired some thinking about the problems inherent in trying to implement Forth using other languages. The precise statement of mine was: # I'd be happy to see a full Standard version of Forth implemented in 'C' # which allows a user the same control as he enjoys in an assembly language # version (in particular the ability to add new primitives without # understanding 'C', the 'C' source code for the implementation, and how his # 'C' compiler works). Failing this, I can only conclude that the purpose of # implementing Forth in 'C' is to provide 'C' programmers with a toy version # of Forth. It's a shame that people homed in on the second sentence rather than the first. ---- Before we go any further, I'd like to take a vote: All Forth programmers who believe that "the main thing that distinguishes Forth from other languages is the parameter stack" please raise your hand. [see Bill, none of them do!] ---- What would a 'C' programmer do if he received from a friend a program which used declared variables of type "long" [note for Forth programmers: this is what they call double precision integers] and his 'C' compiler does not support this type? a. Ask your friend to recode her program b. Buy a new 'C' compiler c. Modify your 'C' compiler, even though you don't have the source code d. Write a note to comp.lang.c telling everyone that you have a "toy" compiler e. Let off steam by writing to comp.lang.forth and harassing the Forth programmers Most 'C' programmers would do b. and e. (they have too much pride to do a. or d.) Almost any Forth programmer (using a traditional implementation of Forth) would do the equivalent of c. Please note that the problem is to add the data type "long" in such a way that it has the same status as built-in types. Treating a "long" as an array or a "struct" and defining functions like long_add would not do! [to Pascal programmers: a struct is what they call a record] [to Forth programmers: we don't have one, but by the time someone has finished explaining what it is, you'll have figured out how to make one] You want to read, without alteration, the source code of a person who has "long" as a primary type. She has declarations like: long x,y,z; and statements like: z = x + y; or worse: z = x + w; (with w an "int"). Her compiler recognizes "long" as a primary type (and handles the attendant overloading of symbols and coercion), yours doesn't! [to 'C' programmers: in order not to lose you for the rest of this message, take my word for it: you can't add a new structure like this without (1) changing the compiler, (2) using your compiler to write a new compiler, (3) writing a pre-processor -- which is almost as hard as (2). Please read on and convince yourself later] ------- Let's look at the task of defining a double-precision addition in Forth. Here is the definition in F-83 due to Henry Laxen: \ 32 bit Arithmetic Operations 11OCT83HHL CODE D+ (S d1 d2 -- dsum ) AX POP DX POP BX POP CX POP CX DX ADD BX AX ADC 2PUSH END-CODE Besides being as fast as possible, this definition is quite simple. I would rank this as very low in difficulty -- ANY Forth programmer could do a definition like this. (I think Henry Laxen is a very imaginative and creative person -- but I don't think this is an example. While I don't teach assembly language in my course, any of the students who pass my course would be capable of doing this as one problem on a weekly homework (part of the problem being to learn enough about assembly language to do it). For comparison, let's look at the task of doing this in high-level Forth. The definition of D+ itself is simple -- we want to add the low order words together and then add the high order words WITH THE CARRY FROM THE LOW ORDER: : D+ ( d1 d2 -- d3 ) ROT >R >R ( high order on top ) 0 ADC R> R> ROT ADC DROP ; Unfortunately, high level Forth (just like high level anything else) does not provide us access to the carry flag -- so we must define the word ADC ( n1 n2 c -- n3 c' ;;; add with carry ) which adds with a carry but does not use the hardware carry flag. In the code below, we have synthesized this by treating 16-bit addition as double- precision 8-bit addition. To make things easier to understand, we introduce a collection of pseudo-registers to hold the low and high bytes of intermediate numbers. [To Forth programmers: I know I've used too many variables, but I'm trying to make a point] The word SPLIT ( n i -- ) splits the 16-bit integer n into high and low bytes and stores these in the i-th slots of the arrays. The only other thing to explain is that << and >> are shift words provided in XForth: << ( n k -- n' ) n' is n (logically) shifted left by k bits. CREATE LOW0 4 CELLS ALLOT CREATE HIGH0 4 CELLS ALLOT : 'LOW ( i -- addr-low ) 1- CELLS LOW0 + ; : 'HIGH ( i -- addr-high ) 1- CELLS HIGH0 + ; : LOW 'LOW @ ; : HIGH 'HIGH @ ; : SPLIT ( n i -- ;;; store ) >R DUP 255 AND R@ 'LOW ! 8 >> R> 'HIGH ! ; : ADC ( n1 n2 c -- n3 c' ;;; add with carry ) >R 2 SPLIT 1 SPLIT 1 LOW 2 LOW + R> + 3 SPLIT \ 3 LOW has sum of low bytes, 3 HIGH has carry 1 HIGH 2 HIGH + 3 HIGH + 4 SPLIT \ 4 LOW has sum of high bytes, 4 HIGH has carry 4 LOW 8 << 3 LOW + 4 HIGH ; In contrast to the low-level solution this is slow, complicated, and ugly. (I wrote it to see what would be involved and make no claims that optimal in speed or beauty.) It runs about 60 times slower than the assembly language when both are run under F-83. It is 240 times slower when running on XForth. It is not something an average Forth programmer can be expected to do (it takes a brown belt at least). > The thing that bothers me about this is that you seem to be implying > that the user knows the host assembly language? If not, then what is > the difference between having your forth core written in assembly, or > having it written in basic? Speed! I think that the misconception is thinking of Forth as a program which is written in assembly language. Most programming languages produce a program which is compiled to leave behind "THE RESULT". It is immaterial to someone who runs "THE RESULT" which language was used to produce it. The choice of language is more a matter of convenience to the programmer. Forth is the assembly language for a virtual machine which is mapped to an underlying real processor. It is a combination extensible compiler, extensible language, and extensible operating system WHICH IS MOSTLY WRITTEN IN ITSELF. What I have called "traditionally implemented Forth" is a version of Forth in which a great many words written in Forth itself are mixed with a few words written in the assembly language of the host processor. I have included the entire double-precision wordset for F-83 as an appendix. It has a higher proportion of assembly language words than normal. It should nevertheless be clear that the definitions of all words, both colon and code, are short and simple. A more typical illustration is the I/O interface (including keyboard, video, and files) which, in F-83, consists entirely of high level Forth words based upon a single code primitive: BDOS Load up the registers and do a DOS system call. return the result placed in the A register on the stack. CODE BDOS (S n fun -- m ) AX POP AL AH MOV DX POP 33 INT AH AH SUB 1PUSH END-CODE (INT for "interrupt" is a call to either the BIOS ROM or the operating system. There are a number of books that list all the available interrupt calls stating what they do, what has to be loaded into which registers, and what is in which registers upon return. Any Forth programmer who knows how to read can integrate OS hooks into a Forth program provided that he knows which registers are used by Forth and how the stack is implemented. In F-83, the machine stack is the Forth parameter stack, BP is the return stack pointer, and SI is the Forth instruction pointer -- everything else is free within a word) Traditional Forth is not "implemented in assembly language". It is really implemented in Forth -- but Forth is so closely mapped to the underlying processor that key words can be easily defined in terms of the host's assembly language. In addition to the use of assembly language, a major advantage of the traditional implementation is that it provides the user with access to the details of the implementation. The fact that a user can extend the compiler as part of programming has a *great* deal to do with ease of programming. It is important to understand that the Forth approach to problem solving is radically different than the conventional approach. In Forth we produce a problem-oriented application language and then attack the problem. Thus Forth is more a toolkit for building languages than a language. It would be unfair to say that a great deal of Forth programming requires the use of assembly language or access to the implementation. But it would be fair to say that the 5% or 10% which does use them is responsible for many of the claims made by Forth programmers that they can easily achieve things that would be difficult (or impossible) to do in conventional languages. Please do not assume that it takes a great deal of assembly language knowledge to make major alterations in a traditionally implemented Forth. Several years ago, when I used Kitt Peak VAX-Forth for my class, I wanted to make it look like the dialect of Forth found in the first edition of Brodie's "Starting Forth". Kitt Peak was a traditionally implemented overgrown FIG-Forth and I had to make it look like Brodie's anticipation of Forth-83. The good people at Kitt Peak had found (in 1982) ways to make their 32-bit VAX version compatible with their 16-bit PDP-11 version. Their Forth included "everything but the kitchen sink" -- in particular floating point and double-precision floating point numbers and a lot of "better ideas" like prefacing all byte operations by "b" rather than "c"). I had to add a few missing words ( Why not? What do registers have to do with it? If you are writing from > "within the forth environment" you are writing forth, right? Then why > does it matter what that forth was written in? I hope I've answered this. John J Wavrik jjwavrik@ucsd.edu Dept of Math C-012 Univ of Calif - San Diego La Jolla, CA 92093 P.S. I don't understand how this newsgroup has become split into two subgroups. On the one hand we have the people who want to know how many pins will be on the RTX-3000 (and Phil Koopman who replies by supplying the phone number of Harris, Inc). On the other, we have 'C' programmers who are intent at modifying a language they haven't bothered to understand. Apparently the people who are doing interesting and portable things in Forth are too busy to take the time to tell us about them and post some source code. ============================================================================ APPENDIX -- F-83 DOUBLE PRECISION WORDSET shadow screen comments are indented \ 16 bit Arithmetic Operations Unsigned Multiply 26Sep83map You could write a whole book about multiplication and division, and in fact Knuth did. Suffice it to say that UM* is the basic multiplication primitive in Forth. It takes two unsigned 16 bit integers and returns an unsigned 32 bit result. All other multiplication functions are derived from this primitive one. It probably isn't particularly fast or elegant, but that is because I never liked arithmetic and I stole this implementation from FIG Forth anyway. U*D is a synonym for UM* \ 16 bit Arithmetic Operations Unsigned Multiply 22Aug83map CODE UM* (S n1 n2 -- d ) AX POP BX POP BX MUL DX AX XCHG 2PUSH END-CODE : U*D (S n1 n2 -- d ) UM* ; \ 16 bit Arithmetic Operations Division subroutines 05MAR83HHL These are various subroutines used by the division primitive in Forth, namely U/. Again I must give credit for them to FIG Forth, since if I can't even understand multiply, divide would be completely hopeless. \ 16 bit Arithmetic Operations Unsigned Divide 05MAR83HHL UM/MOD This is the division primitive in Forth. All other division operations are derived from it. It takes a double number, d1, and divides by by a single number n1. It leaves a remainder and a quotient on the stack. For a clearer understanding of arithmetic consult Knuth Volume 2 on Seminumerical Algorithms. \ 16 bit Arithmetic Operations Unsigned Divide 22Aug83map CODE UM/MOD (S d1 n1 -- Remainder Quotient ) BX POP DX POP AX POP BX DX CMP >= ( divide by zero? ) IF -1 # AX MOV AX DX MOV 2PUSH THEN BX DIV 2PUSH END-CODE \ 16 bit Comparison Operations 05MAR83HHL 0= Returns True if top is zero, False otherwise. 0< Returns true if top is negative, ie sign bit is on. 0> Returns true if top is positive. 0<> Returns true if the top is non-zero, False otherwise. = Returns true if the two elements on the stack are equal, False otherwise. <> Returns true if the two element are not equal, else false. ?NEGATE Negate the second element if the top is negative. \ 16 bit Comparison Operations 04OCT83HHL ASSEMBLER LABEL YES TRUE # AX MOV 1PUSH LABEL NO FALSE # AX MOV 1PUSH CODE 0= (S n -- f ) AX POP AX AX OR YES JE NO #) JMP END-CODE CODE 0< (S n -- f ) AX POP AX AX OR YES JS NO #) JMP END-CODE CODE 0> (S n -- f ) AX POP AX AX OR YES JG NO #) JMP END-CODE CODE 0<> (S n -- f ) AX POP AX AX OR YES JNE NO #) JMP END-CODE CODE = (S n1 n2 -- f ) AX POP BX POP AX BX CMP YES JE NO #) JMP END-CODE : <> (S n1 n2 -- f ) = NOT ; : ?NEGATE (S n1 n2 -- n3 ) 0< IF NEGATE THEN ; \ 16 bit Comparison Operations 27Sep83map U< Compare the top two elements on the stack as unsigned integers and return true if the second is less than the first. Be sure to use U< whenever comparing addresses, or else strange things will happen beyond 32K. U> Compare the top two elements on the stack as unsigned integers. True if n1 > n2 unsigned. < Compare the top two elements on the stack as signed integers and return true if n1 < n2. > Compare the top two elements on the stack as signed integers and return true if n1 > n2. MIN Return the minimum of n1 and n2 MAX Return the maximum of n1 and n2 BETWEEN Return true if min <= n1 <= max, otherwise false. WITHIN Return true if min <= n1 < max, otherwise false. \ 16 bit Comparison Operations 11OCT83HHL ASSEMBLER LABEL YES TRUE # AX MOV 1PUSH CODE U< (S n1 n2 -- f ) AX POP BX POP AX BX CMP YES JB NO #) JMP END-CODE CODE U> (S n1 n2 -- f ) AX POP BX POP BX AX CMP YES JB NO #) JMP END-CODE CODE < (S n1 n2 -- f ) AX POP BX POP AX BX CMP YES JL NO #) JMP END-CODE CODE > (S n1 n2 -- f ) AX POP BX POP AX BX CMP YES JG NO #) JMP END-CODE : MIN (S n1 n2 -- n3 ) 2DUP > IF SWAP THEN DROP ; : MAX (S n1 n2 -- n3 ) 2DUP < IF SWAP THEN DROP ; : BETWEEN (S n1 min max -- f ) >R OVER > SWAP R> > OR NOT ; : WITHIN (S n1 min max -- f ) 1- BETWEEN ; \ 32 bit Memory Operations 09MAR83HHL 2@ Fetch a 32 bit value from addr. 2! Store a 32 bit value at addr. \ 32 bit Memory Operations 13Apr84map CODE 2@ (S addr -- d ) BX POP 0 [BX] AX MOV BX INC BX INC 0 [BX] DX MOV 2PUSH END-CODE CODE 2! (S d addr -- ) BX POP 0 [BX] POP BX INC BX INC 0 [BX] POP NEXT END-CODE \ 32 bit Memory and Stack Operations 26Sep83map 2DROP Drop the top two elements of the stack. 2DUP Duplicate the top two elements of the stack. 2SWAP Swap the top two pairs of numbers on the stack. You can use this operator to swap two 32 bit integers and preserve their meaning as double numbers. 2OVER Copy the second pair of numbers over the top pair. Behaves like 2SWAP for 32 bit integers. 3DUP Duplicate the top three elements of the stack. 4DUP Duplicate the top four elements of the stack. 2ROT rotates top three double numbers. \ 32 bit Memory and Stack Operations 11OCT83HHL CODE 2DROP (S d -- ) AX POP AX POP NEXT END-CODE CODE 2DUP (S d -- d d ) AX POP DX POP DX PUSH AX PUSH 2PUSH END-CODE CODE 2SWAP (S d1 d2 -- d2 d1 ) CX POP BX POP AX POP DX POP BX PUSH CX PUSH 2PUSH END-CODE CODE 2OVER (S d2 d2 -- d1 d2 d1 ) CX POP BX POP AX POP DX POP DX PUSH AX PUSH BX PUSH CX PUSH 2PUSH END-CODE : 3DUP (S a b c -- a b c a b c ) DUP 2OVER ROT ; : 4DUP (S a b c d -- a b c d a b c d ) 2OVER 2OVER ; : 2ROT (S a b c d e f - c d e f a b ) 5 ROLL 5 ROLL ; \ 32 bit Arithmetic Operations 05MAR83HHL D+ Add the two double precision numbers on the stack and return the result as a double precision number. DNEGATE Same as NEGATE except for double precision numbers. S>D Take a single precision number and make it double precision by extending the sign bit to the upper half. DABS Return the absolute value of the 32 bit integer on the stack \ 32 bit Arithmetic Operations 11OCT83HHL CODE D+ (S d1 d2 -- dsum ) AX POP DX POP BX POP CX POP CX DX ADD BX AX ADC 2PUSH END-CODE CODE DNEGATE (S d# -- d#' ) BX POP CX POP AX AX SUB AX DX MOV CX DX SUB BX AX SBB 2PUSH END-CODE CODE S>D (S n -- d ) AX POP CWD AX DX XCHG 2PUSH END-CODE CODE DABS (S d# -- d# ) DX POP DX PUSH DX DX OR ' DNEGATE @-T JS NEXT END-CODE \ 32 bit Arithmetic Operations 06Apr84map D2* 32 bit left shift. D2/ 32 bit arithmetic right shift. Equivalent to divide by 2. D- Subtract the two double precision numbers. ?DNEGATE Negate the double number if the top is negative. \ 32 bit Arithmetic Operations 06Apr84map CODE D2* (S d -- d*2 ) AX POP DX POP DX SHL AX RCL 2PUSH END-CODE CODE D2/ (S d -- d/2 ) AX POP DX POP AX SAR DX RCR 2PUSH END-CODE : D- (S d1 d2 -- d3 ) DNEGATE D+ ; : ?DNEGATE (S d1 n -- d2 ) 0< IF DNEGATE THEN ; \ 32 bit Comparison Operations 01Oct83map D0= Compare the top double number to zero. True if d = 0 D= Compare the top two double numbers. True if d1 = d2 DU< Performs unsigned comparison of two double numbers. D< Compare the top two double numbers. True if d1 < d2 D> Compare the top two double numbers. True if d1 > d2 DMIN Return the lesser of the top two double numbers. DMAX Return the greater of the the top two double numbers. \ 32 bit Comparison Operations 01OCT83MAP : D0= (S d -- f ) OR 0= ; : D= (S d1 d2 -- f ) D- D0= ; : DU< (S ud1 ud2 -- f ) ROT SWAP 2DUP U< IF 2DROP 2DROP TRUE ELSE <> IF 2DROP FALSE ELSE U< THEN THEN ; : D< (S d1 d2 -- f ) 2 PICK OVER = IF DU< ELSE NIP ROT DROP < THEN ; : D> (S d1 d2 -- f ) 2SWAP D< ; : DMIN (S d1 d2 -- d3 ) 4DUP D> IF 2SWAP THEN 2DROP ; : DMAX (S d1 d2 -- d3 ) 4DUP D< IF 2SWAP THEN 2DROP ; \ Mixed Mode Arithmetic 27Sep83map This does all the arithmetic you could possibly want and even more. I can never remember exactly what the order of the arguments is for any of these, except maybe * / and MOD, so I suggest you just try it when you are in doubt. That is one of the nice things about having an interpreter around, you can ask it questions anytime and it will tell you the answer. *D multiplys two singles and leaves a double. M/MOD divides a double by a single, leaving a single quotient and a single remainder. Division is floored. MU/MOD divides a double by a single, leaving a double quotient and a single remainder. Division is floored. \ Mixed Mode Arithmetic 04OCT83HHL : *D (S n1 n2 -- d# ) 2DUP XOR >R ABS SWAP ABS UM* R> ?DNEGATE ; : M/MOD (S d# n1 -- rem quot ) ?DUP IF DUP >R 2DUP XOR >R >R DABS R@ ABS UM/MOD SWAP R> ?NEGATE SWAP R> 0< IF NEGATE OVER IF 1- R@ ROT - SWAP THEN THEN R> DROP THEN ; : MU/MOD (S d# n1 -- rem d#quot ) >R 0 R@ UM/MOD R> SWAP >R UM/MOD R> ; \ 16 bit multiply and divide 27Sep83map */ is a particularly useful operator, as it allows you to do accurate arithmetic on fractional quantities. Think of it as multiplying n1 by the fraction n2/n3. The intermediate result is kept to full accuracy. Notice that this is not the same as * followed by /. See Starting Forth for more examples. \ 16 bit multiply and divide 04OCT83HHL : * (S n1 n2 -- n3 ) UM* DROP ; : /MOD (S n1 n2 -- rem quot ) >R S>D R> M/MOD ; : / (S n1 n2 -- quot ) /MOD NIP ; : MOD (S n1 n2 -- rem ) /MOD DROP ; : */MOD (S n1 n2 n3 -- rem quot ) >R *D R> M/MOD ; : */ (S n1 n2 n3 -- n1*n2/n3 ) */MOD NIP ;