Path: utzoo!attcan!ncrcan!ziebmef!daniel
From: daniel@ziebmef.mef.org (Daniel Albano)
Newsgroups: comp.lang.apl
Subject: Re: APL Machines
Message-ID: <1989Sep19.111751.7613@ziebmef.mef.org>
Date: 19 Sep 89 15:17:49 GMT
References: <153557@<1989Sep5> <49700014@uicsrd.csrd.uiuc.edu> <22186@cup.portal.com>
Reply-To: daniel@ziebmef.mef.org (Daniel Albano)
Organization: Ziebmef Public Access Unix, Toronto, Ontario
Lines: 123

The question of "hardware optimization" for APL machines requires some consideration of how they will be used. Are we speaking here of single-user or multi-user machines? Are we looking at multiple user tasks, or single-workspace machines with synchronous processing (single task/workspace)? Is the machine going to run "APLish" code (data embedded in arrays, large "logically parallel" operations); or will it be used for the "third generation language" processing style that tends to predominate in COBOL, dBase, and PASCAL applications (mixed data types stored as "records" or in structures accessed through index operations)?

How it will be used - and who is going to be footing the bill :-) - will largely determine the design choices. Exxon crunching seismic data will have a much different view of the "optimal machine" than I will; probably by a few million dollars.

I tend to discount the overheads in handling small (several element to scalar) data structures, because if that is all there is, then the application is unlikely to be large, and there should be lots of spare machine cycles. When the work grows, if the application is well designed, so does the size of your data entities.

Making APL hardware could involve implementing a lot of APL as "machine instructions", but that really means microcode - programs written in a constrained environment to squeeze as much as possible out of very elementary operations. Given the raw hardware capabilities (single cycle instruction execution, pipelining, branch lookahead, etc.)
the idea of building what could be termed an ECISC machine (Exceedingly Complicated Instruction Set Computing - and I do mean complicated, not complex) flies in the face of current conventional wisdom on performance. Of course, this would not be the first time that conventional wisdom turned out to be wrong. The structuring of the execution environment by the APL interpreter would be an advantage here - but it is equally an advantage in generating highly tuned object code - and perhaps an assembler would be easier to work with than a micro-assembler.

As is often the case, a middle ground may work well. Perhaps a more or less conventional CISC (or RISC?) machine with a few well chosen enhancements to the instruction set to improve fundamental APL operations would offer advantages without too much sacrifice of development effort and system complexity.

The nature of the enhancements depends very much on what you consider the key APL level operations to be. My own list of crucial favourites would include reshape, dyadic iota, compress, reduce, and generalized inner products (or at least and.equals). Operations on Boolean arrays would also be very high on the list, as would comparisons, especially those for integer and character data. Both base value and representation are also important from a performance view in that they should not be overly slow, but good system design can keep their use far below that of the previously mentioned operations.

One operation implicit in any APL that is crucial is the allocation of storage for data entities. Function calling, including the creation of local symbols for user defined functions, is another thing that can happen very often in a well-structured system.

On the other hand, I don't really care that much how fast the trig functions or domino run, nor that much how fast exponentiation and the like are.
Someone who does more "calculation" rather than "byte bashing" (commercial/logical systems) would again have a different set of priorities. If I were heavily into graphics in a particular instance I might care about floating point, but in most systems I have seen, comparisons and data manipulation far outweigh calculation.

The most crucial item in a lot of cases is just workspace size. An application that can get most or all of its data in a workspace and have enough left over for transient storage for intermediate results will be not only faster, but easier to write and to maintain. I have used workspaces from 32K to several megabytes, and would not willingly go back to the small ones. Of course, my home system has a 750K workspace, so most of my recent code has been very easy - I just don't seem to have too much data in my life :-) ... of course the data compression / storage reduction techniques learned in those 32K workspaces still do a great job in maximizing the capabilities of something more than twenty times larger.

Personally, I don't think (for a single user machine) that you have to provide multiple "control streams" or simultaneous processing of user level tasks, but some parallelism in the execution of array operations - or a high speed vector processing scheme - could provide a major boost in performance. In many operations on conformal arrays, the (logical) structure of the data is not important during the actual computation step. If you could generalize (should be easy) for scalar/array operations, then you have a large part of the problem in hand. Many other operations could be decomposed into a number of vector, or "quasi-vector" operations. By quasi-vector, I mean that there are a number of elements, regularly and repetitively spaced through the machine's memory, with (temporarily) irrelevant elements interspersed.
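
To make the "quasi-vector" idea concrete, here is a minimal sketch in Python (chosen only for illustration; it is not taken from any APL implementation). It assumes a matrix stored row by row in a flat list standing in for machine memory, so that one column is a set of elements spaced a fixed stride apart, with the other columns' elements interspersed. The names strided_view and strided_add are made up for this example.

```python
# Illustrative sketch only: a flat Python list stands in for machine memory.
# The helper names below are hypothetical, not part of any real APL system.

def strided_view(memory, start, stride, count):
    """Return the `count` elements beginning at `start`, spaced `stride` apart."""
    stop = start + stride * count
    return memory[start:stop:stride]

def strided_add(memory, start, stride, count, scalar):
    """Add `scalar` in place to each element of the quasi-vector.

    Elements lying between the strided positions are skipped untouched.
    """
    for i in range(start, start + stride * count, stride):
        memory[i] += scalar

# A 3x4 matrix stored row-major: row r occupies memory[r*cols:(r+1)*cols].
rows, cols = 3, 4
memory = list(range(rows * cols))

# Column 1 is a quasi-vector: it starts at offset 1 and recurs every `cols` cells.
column_1 = strided_view(memory, start=1, stride=cols, count=rows)

# A scalar/array operation applied to that column alone.
strided_add(memory, start=1, stride=cols, count=rows, scalar=100)
```

The same two parameters (a starting offset and a stride) also describe a row or a diagonal of the matrix, which is why a single strided vector unit could plausibly serve many array operations.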

In real world terms, another major performance boost is a true native APL file system - one that stores the components in their internal representation. I did work with one system that only stored what were essentially character representations of the data (two dimensional? memory is mercifully weak on the details) and a lot of time was spent converting to and from "file format". This is probably too peripheral for your interests, as it should be primarily a "software file server" issue, but it does touch on (perhaps) some details of the hardware/software interface.

The idea of making APL the operating system, and giving it complete and direct control over the system, had considerable appeal. The benefits could be enormous.

Speaking of Analogic, are they still around? Do they still have any APL products? And, for that matter, is there a good APL for Intel (MS-DOS) machines that does not cost the better part of a thousand dollars?

Daniel Albano
Toronto, Ontario