Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!aplcomm.jhuapl.edu!john
From: john@aplcomm.jhuapl.edu (John Hayes)
Newsgroups: comp.lang.forth
Subject: Re: FORTH ENTINGES/ APL 32 bit
Keywords: SC32, stack CPU
Message-ID: <1990Dec9.042828.18170@aplcen.apl.jhu.edu>
Date: 9 Dec 90 04:28:28 GMT
References: <2002.UUL1.3#5129@willett.pgh.pa.us> <1990Nov26.155122.28988@aplcen.apl.jhu.edu> <1139@shakti.ncst.ernet.in>
Sender: news@aplcen.apl.jhu.edu (USENET News System)
Reply-To: john@aplcomm.jhuapl.edu (John Hayes)
Organization: JHU/APL, Laurel, MD
Lines: 46

H. Shrikumar writes:
> Marty, could you give any pointers to articles/papers about this chip
> besides those in the FORTH niche publications you mention... surely
> the survival of radically different architecture like the SC32/RTX2000
> is news to lots of other (non-Forth) people .. for eg. comp.arch
> will be happy to hear more about the SC32 and its well-being.
> Perhaps there are some ASPLOS, or ACM SIG?? or IEEE ??? articles ?
> Or can one get a flyer from Silcon Composers ?
> There is one paper in ACM/ASPLOS-II Conference (An Architecture
> for direct execution of FORTH - John Hayes, Marty Fraeman et al). I
> assume the SC32 is a mature descendant of this 4um, 1.5MHz MOSIS
> prototype part.

You are right. The SC32 is our third generation Forth microprocessor.
A sparse data sheet is available from Silicon Composers Inc. A paper
describing the processor in some detail appeared in a recent issue of
The Journal of Forth Application and Research. A less detailed
description of the SC32 appeared about a year ago in Forth Dimensions.
We haven't published anything in any "main stream" journals.

> For those in comp.arch following the thread about registers/caches ...
> the above paper analyses Hoshagawa's (?) cut-back-K algorithm for
> stack cacheing. You stack the top N words of cache, the ALU can access
> TOS and TOS-1 directly.
> On an underflow-overflow you read in/out K
> words. Optimal is when K=N/2.

During the design of the SC32, I did a lot of stack caching
simulations. I found that the cut-back-K algorithm analyzed in the
ASPLOS paper is inadequate. That analysis assumed that the stack depth
follows a random walk, and from this concluded that K=N/2 is optimal.
My measurements of real programs indicate that in Forth programs the
stack depth stays near a fixed depth for long periods, with small
oscillations occurring around this depth. This non-random behavior is
a product of the repetitious nature of most programs. My new analysis
concludes that K=1 is optimal: when the stack cache (buffer) is full,
write out one value, and when the cache (buffer) is empty, read in one
value. We are so confident about this algorithm that we implemented it
in hardware on the SC32.

I have written a detailed paper (which is still looking for a home) on
this study. A summary appears in the JFAR paper mentioned above.

John R. Hayes
john@aplcomm.jhuapl.edu
Applied Physics Laboratory
Johns Hopkins University
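
To make the two policies concrete, here is a minimal Python sketch of a
cut-back-K stack buffer that counts words moved to and from memory. It
is an illustration under stated assumptions, not the SC32 hardware and
not the simulator used in the study: the names StackCache, run and
oscillating_trace, the buffer size n=8, and the synthetic depth trace
are all invented for the example.

```python
class StackCache:
    """Hypothetical model: a buffer of up to n stack words that spills
    or fills k words at a time (cut-back-K). Only traffic is counted."""

    def __init__(self, n, k):
        assert 1 <= k <= n
        self.n, self.k = n, k
        self.cached = 0      # words currently held in the buffer
        self.in_memory = 0   # words spilled to main memory
        self.traffic = 0     # total words moved in either direction

    def push(self):
        if self.cached == self.n:        # overflow: write out k words
            self.cached -= self.k
            self.in_memory += self.k
            self.traffic += self.k
        self.cached += 1

    def pop(self):
        if self.cached == 0:             # underflow: read in up to k words
            if self.in_memory == 0:
                raise IndexError("pop from empty stack")
            moved = min(self.k, self.in_memory)
            self.cached += moved
            self.in_memory -= moved
            self.traffic += moved
        self.cached -= 1


def run(trace, n, k):
    """Replay a trace of +1 (push) / -1 (pop) and return words moved."""
    cache = StackCache(n, k)
    for step in trace:
        if step > 0:
            cache.push()
        else:
            cache.pop()
    return cache.traffic


def oscillating_trace(base, swing, cycles):
    """Assumed workload: climb to depth `base`, then repeatedly go
    `swing` words up and back down again."""
    trace = [+1] * base
    for _ in range(cycles):
        trace += [+1] * swing + [-1] * swing
    return trace


if __name__ == "__main__":
    trace = oscillating_trace(base=23, swing=6, cycles=100)
    for k in (1, 4):
        print("n=8 k=%d words moved: %d" % (k, run(trace, 8, k)))
```

With n=8, a trace that climbs to depth 23 and then swings 6 words up
and down for 100 cycles moves 21 words in total with k=1 but 820 with
k=4. In this model the k=4 buffer can only sit at offsets that are
multiples of 4, so it never covers the whole swing and spills and
refills on every cycle, while the k=1 buffer slides into place once
and stays there. Started from depth 20 instead, both policies settle
(18 versus 20 words). A synthetic trace like this only shows the
mechanism; it says nothing about how real Forth programs behave.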