Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!amdcad!cayman!tim
From: tim@cayman.amd.com (Tim Olson)
Newsgroups: comp.lang.forth
Subject: Re: FORTH, RISC, and other new architectures
Message-ID: <26573@amdcad.AMD.COM>
Date: 2 Aug 89 00:55:17 GMT
References: <385@ryn.esg.dec.com> <975@key.COM>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc., Austin, Texas
Lines: 43
Summary:
Expires:
Sender:
Followup-To:

In article <975@key.COM> ken@samaria.key.COM (Ken Kofman) writes:
| It would be extremely difficult for FORTH to take full advantage of
| the large register files the RISC chips provide, unless a FORTH
| compiler is provided, or the programmer resorts to assembler :(.
| Even on the transputer, which, at first glance, seems particularly
| well suited to FORTH, there must be a good deal of time wasted
| maintaining the stack--the transputer's stack is only three deep.

Yes, large standard register files are not very effective in speeding
up FORTH.  However, the Am29000 register file is different.  There are
128 local registers, which are addressed relative to the register
stack pointer (global register 1).  Thus, the local register file can
act like a stack cache, holding the top (or all!) of the operand
stack.  FORTH primitives (compiled in-line) then look like:

	+                    DUP                  SWAP

	add lr1, lr1, lr0    add lr127, lr0, 0    add tmp, lr0, 0
	add gr1, gr1, 4      sub gr1, gr1, 4      add lr0, lr1, 0
	                                          add lr1, tmp, 0

with lr0 holding the top of stack, lr1 the next item on the stack,
etc.  A "smart" FORTH compiler would be able to remove the extraneous
stack-adjustment instructions and change the local register
references accordingly.

The FORTH notion of separate operand and control stacks can also be
implemented by splitting the local register file into two 64-register
chunks, and keeping two shadow stack pointers, swapping the correct
pointer into the real stack pointer when required.
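
For readers who want to trace the three instruction sequences above by
hand, here is a rough Python model of the stack cache.  It is only a
sketch: the class and method names are made up for illustration,
registers are indexed by word (so the "add gr1, gr1, 4" adjustment
becomes a change of 1), and the modulo-128 wraparound that makes lr127
the slot just past lr0 is my reading of how local registers are
addressed relative to gr1, so treat that as an assumption.

```python
# Toy model of the local-register "stack cache"; names are hypothetical.
# Assumption: word indexing, so a 4-byte gr1 adjustment is +/- 1 here.
# Assumption: lrN maps to physical register (gr1 + N) mod 128.

NUM_LOCALS = 128


class StackCache:
    def __init__(self):
        self.regs = [0] * NUM_LOCALS  # physical local registers
        self.gr1 = 0                  # register stack pointer, in words

    def _phys(self, n):
        # Map the logical name lrN to a physical register index.
        return (self.gr1 + n) % NUM_LOCALS

    def lr(self, n):
        return self.regs[self._phys(n)]

    def set_lr(self, n, value):
        self.regs[self._phys(n)] = value

    def push(self, value):
        # Not in the post: a literal push, written in the same style as DUP.
        self.set_lr(127, value)
        self.gr1 -= 1

    def plus(self):
        self.set_lr(1, self.lr(1) + self.lr(0))  # add lr1, lr1, lr0
        self.gr1 += 1                            # add gr1, gr1, 4

    def dup(self):
        self.set_lr(127, self.lr(0))             # add lr127, lr0, 0
        self.gr1 -= 1                            # sub gr1, gr1, 4

    def swap(self):
        tmp = self.lr(0)                         # add tmp, lr0, 0
        self.set_lr(0, self.lr(1))               # add lr0, lr1, 0
        self.set_lr(1, tmp)                      # add lr1, tmp, 0


# Equivalent of the FORTH fragment:  2 3 DUP + SWAP
s = StackCache()
s.push(2)
s.push(3)
s.dup()
s.plus()
s.swap()
# Now s.lr(0) == 2 (top of stack) and s.lr(1) == 6 (next on stack).
```

Note that only "+" and DUP touch gr1; SWAP is pure register-to-register
traffic, which is the kind of sequence a smarter compiler could often
eliminate by renaming registers instead.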

Splitting the local register file into two 64-word stacks raises a
couple of questions:  First, are 64 words each of control and operand
stack enough?  How deeply nested do FORTH words typically get?
Second, does a typical FORTH program (coded with in-line primitives
and call-threaded) spend most of its time executing FORTH primitives
or executing the thread of control for user-written words?

	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)