Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!iuvax!uxc.cso.uiuc.edu!uxc.cso.uiuc.edu!ux1.cso.uiuc.edu!uicsrd.csrd.uiuc.edu!jaxon
From: jaxon@uicsrd.csrd.uiuc.edu
Newsgroups: comp.lang.apl
Subject: Re: APL Machines
Message-ID: <49700014@uicsrd.csrd.uiuc.edu>
Date: 11 Sep 89 16:26:00 GMT
References: <153557@<1989Sep5>
Lines: 79
Nf-ID: #R:<1989Sep5:153557:uicsrd.csrd.uiuc.edu:49700014:000:3998
Nf-From: uicsrd.csrd.uiuc.edu!jaxon Sep 11 11:26:00 1989

> There was an APL machine (by that name) produced by a corporation
> somewhere on the east coast of the US. It was based on 68000
> processors,...

You're thinking of Analogic Corporation's 68000/CATscanner hybrid. The 68000s were just control processors for the interpreter; most primitives were microcoded on a 12-bit x 8 element Array Processor which had been scavenged from one of Analogic's medical imaging instruments. I believe it's still on the market, although no plans exist to upgrade it.

The interpreter and user interface are excellent. The CAT scanner part is a merciless number cruncher, so fast that you won't notice the effects of long vector lengths until well past 1000 elements (i.e. it is several hundred times faster than a 68000).

There is a lesson in the APL Machine design, though. The array processor is not enough! Despite the implementer's heavy use of the AP (e.g. linear search on the AP always outperformed hash table lookup on the 68000), the 68000 was a constant damper on program speed. Memory management of small vectors and scalars is not an especially parallelizable aspect of APL. The uniprocessors responsible for serial sections of the interpreter, and whatever features are used to synchronize the parallel sections, are absolutely critical elements in an APL supercomputer.

> The "equals" primitive could be written in APL : ...[buggy code omitted]

Several interpreters have tried using "magic functions" to produce new primitives from old.
It is never as easy as it looks, and it is NEVER fast - it's not even tolerably slow.

1) Your definition is wrong -- you must take absolute values before comparing the arguments.

2) Once the correct definition is written, you must make it work even when the intermediate terms exceed the number system's limits.

3) By now you've got a function that works for simple scalars. To call it you'll have to create two scalar APL objects, and a stack frame (that's invisible in the caller's ")SI"). You'll have to make a class of APL function capable of returning into a primitive algorithm at the correct place.

This does not compare favorably with a single instruction for Tolerant Equals. I'm not a great fan of #CT and its consequences, but it is STANDARD and heavily relied upon, and it is one more language-specific hardware feature that APL could really use.

> "Dictionary APL" NOT the "APL2" dialect.

Firstly, I'd urge any APL designer to become deeply familiar with BOTH these language definitions, and to really use BOTH systems.

The dictionary approach ("function rank") is really a wonderful perfection of the original APL array processing ideas. I suspect it is more efficient to implement, because it is a little less powerful than the equivalent features in APL2. In "Dictionary" APL, operators are much more powerful. If operators can really manipulate the functions (e.g. pipelining them, carrying temporary results in registers, etc.) then I'd say the dictionary approach is best suited to today's vector supercomputers.

But "function rank" seems tied to homogeneous arrays (am I wrong here?). There are real limits on programmers' ability to foresee what a function expression will do. In the APL2 approach, all the data decompositions are explicitly written out; you can enter the subexpressions and watch what's happening to your data. You can also do all kinds of unorthodox decompositions of your data, which stand no chance of being vectorizable.
And "vectorizing" is not the only hope for APL anyway! Multiple Instruction Multiple Data parallel machines are growing in number and power; these don't require that "one function" be in control, that "two arguments" be in memory, and that "one type" of result be expected. I think the APL2 approach will provide very rich ground for parallel machine designers.

Thanks in advance for any replies!

greg jaxon -- jaxon@uicsrd.csrd.uiuc.edu
Univ. of Ill. Center for Supercomputing R&D
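
Illustration of points 1) and 2) on tolerant equals, above. This is only a rough sketch in Python rather than APL, and it is not the code of any actual interpreter. It assumes the usual relative-tolerance definition (two numbers match when the magnitude of their difference is no more than the tolerance times the larger magnitude). The name tolerant_equal and the default tolerance of 1e-13 are assumptions made for the example, not values taken from the post.

```python
import math


def tolerant_equal(a: float, b: float, ct: float = 1e-13) -> bool:
    """Relative-tolerance comparison in the style of APL's tolerant equals.

    Assumed definition: a and b match when |a - b| <= ct * max(|a|, |b|),
    with 0 <= ct < 1. The default ct is a hypothetical choice.
    """
    if a == b:
        # Exact match, including 0 == -0 and equal infinities.
        return True
    if math.isnan(a) or math.isnan(b) or math.isinf(a) or math.isinf(b):
        return False
    # Point 2: when the signs differ, |a - b| equals |a| + |b|, which is
    # already larger than ct * max(|a|, |b|) for any ct < 1. Answering here
    # avoids forming a - b, which can overflow for huge opposite-signed
    # arguments.
    if (a < 0) != (b < 0):
        return False
    # Point 1: compare magnitudes. Without abs(), the right-hand side would
    # be negative for two negative arguments and nothing would ever match.
    # With equal signs, a - b cannot overflow.
    return abs(a - b) <= ct * max(abs(a), abs(b))
```

With these assumptions, tolerant_equal(-1.0, -1.0 - 1e-15) is true and tolerant_equal(1.7e308, -1.7e308) is false without any overflow. Even this scalar version needs several tests and branches per pair of numbers, which is the cost point 3) is getting at when the alternative is one hardware instruction.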