Path: utzoo!attcan!ncrcan!ziebmef!daniel
From: daniel@ziebmef.mef.org (Daniel Albano)
Newsgroups: comp.lang.apl
Subject: Re: APL Machines
Message-ID: <1989Sep19.111751.7613@ziebmef.mef.org>
Date: 19 Sep 89 15:17:49 GMT
References: <153557@<1989Sep5> <49700014@uicsrd.csrd.uiuc.edu> <22186@cup.portal.com>
Reply-To: daniel@ziebmef.mef.org (Daniel Albano)
Organization: Ziebmef Public Access Unix, Toronto, Ontario
Lines: 123

The question of "hardware optimization" for APL machines 
requires some consideration of how they will be used.  Are
we speaking here of single user or multi-user machines?
Are we looking at multiple user tasks, or single-workspace
machines with synchronous processing (single task/workspace)?
Is the machine going to run "APLish" code (data embedded
in arrays, large "logically parallel" operations); or will 
it be used for "third generation language" processing style
as tends to predominate in COBOL, dBase, and PASCAL applications
(mixed data type stored as "records" or in structures accessed
through index operations)?
 
Choices in how it will be used - and who is going to be 
footing the bill :-)  will largely determine design choices.
Exxon crunching seismic data will have a much different view
of the "optimal machine" than I will; probably by a few million
dollars.
 
I tend to discount the overheads in handling small (several 
element to scalar) data structures, because if that is all
there is, then the application is unlikely to be large, and
there should be lots of spare machine cycles.  When the work
grows, if the application is well designed, so does the size
of your data entities.
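
A rough way to picture this is a fixed dispatch cost per primitive
plus a cost per element.  The sketch below is only an illustration:
the two constants are made-up units, not measurements of any real
interpreter.

```python
# Hypothetical cost model: every primitive call pays a fixed
# interpretive overhead, then a small cost for each element.
# Both constants are invented for illustration only.
OVERHEAD = 50.0      # assumed cost of dispatching one primitive
PER_ELEMENT = 1.0    # assumed cost of processing one element


def cost_per_element(n):
    """Average cost per element when one primitive handles n elements."""
    return (OVERHEAD + PER_ELEMENT * n) / n


if __name__ == "__main__":
    for n in (1, 10, 1000, 100000):
        print(n, cost_per_element(n))
```

Under these assumptions the overhead dominates for scalars and
all but vanishes once the arrays reach a few thousand elements.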
 
Making APL hardware could involve implementing a lot of APL
as "machine instructions", but that really means microcode -
programs written in a constrained environment to squeeze as
much as possible out of very elementary operations.  Given the
raw hardware capabilities (single cycle instruction execution,
pipelining, branch lookahead, etc.) the idea of building 
what could be termed an ECISC machine (Exceedingly Complicated
Instruction Set Computing - and I do mean complicated, not
complex) flies in the face of current conventional wisdom
on performance.  Of course, this would not be the first time
that conventional wisdom could be wrong.  The structuring of
the execution environment by the APL interpreter would be an 
advantage here - but it is equally an advantage in generating 
highly tuned object code - and perhaps an assembler
would be easier to work with than a micro-assembler.  As is 
often the case, a middle ground may work well.  Perhaps a 
more or less conventional CISC (or RISC?) machine with a few
well chosen enhancements to the instruction set to improve
fundamental APL operations would offer advantages without too
much sacrifice of development effort and system complexity.
 
The nature of the enhancements depends very much on what you
consider the key APL level operations to be.  My own list of 
crucial favourites would include reshape, dyadic iota, 
compress, reduce, and generalized inner products (or at least
and.equals).  Operations on Boolean arrays would also be very 
high on the list, as would comparisons, especially those for
integer and character data.  Both base value and representation
are also important from a performance view in that they should
not be overly slow, but good system design can keep their use 
far below that of the previously mentioned operations.  One 
operation implicit in any APL that is crucial is the allocation
of storage for data entities.  Function calling, including the
creation of local symbols for user defined functions, is another
thing that can happen very often in a well-structured system.
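
For readers less familiar with the primitives on that list, here
is a minimal sketch of what each one computes, written in Python
over plain lists.  The function names are my own, origin 0 is
assumed, and this says nothing about how a real interpreter or
any hardware would implement them.

```python
from functools import reduce as fold
from itertools import cycle, islice
import operator


def reshape(shape, data):
    """Reshape to a rows x cols matrix, reusing data cyclically."""
    rows, cols = shape
    flat = list(islice(cycle(data), rows * cols))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def index_of(haystack, needles):
    """Dyadic iota (origin 0): first position of each needle in
    haystack, or len(haystack) when the needle is absent."""
    first = {}
    for i, x in enumerate(haystack):
        first.setdefault(x, i)
    return [first.get(x, len(haystack)) for x in needles]


def compress(mask, data):
    """Keep the elements of data whose mask entry is nonzero."""
    return [x for m, x in zip(mask, data) if m]


def reduce_with(fn, data):
    """Reduce.  Note: this folds left to right, which matches APL
    only for associative functions such as + or 'and'."""
    return fold(fn, data)


def and_equals(a, b):
    """and.equals inner product: result[i][j] is 1 when row i of a
    matches column j of b in every position."""
    cols = list(zip(*b))
    return [[int(all(x == y for x, y in zip(row, col))) for col in cols]
            for row in a]


if __name__ == "__main__":
    print(reshape((2, 3), [1, 2, 3, 4]))           # [[1, 2, 3], [4, 1, 2]]
    print(index_of("abcab", "bxa"))                # [1, 5, 0]
    print(compress([1, 0, 1, 1], [10, 20, 30, 40]))  # [10, 30, 40]
    print(reduce_with(operator.add, [1, 2, 3, 4]))   # 10
    table = ["cat", "dog", "cow"]
    keys = list(zip("dog", "cat"))   # 3 x 2: each column is one key
    print(and_equals(table, keys))   # [[0, 1], [1, 0], [0, 0]]
```

Every one of these is a comparison or a data movement rather than
arithmetic, which is the point of the list.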
 
On the other hand, I don't really care that much how fast 
the trig functions or domino run, nor that much how fast 
exponentiation and the like are.  Someone who does more 
"calculation" rather than "byte bashing" (commercial/logical
systems) would again have a different set of priorities.  If
I were heavily into graphics in a particular instance I might
care about floating point, but in most systems I have seen,
comparisons and data manipulation far outweigh calculation.
 
The most crucial item in a lot of cases is just workspace size.
An application that can get most or all of its data in a 
workspace and have enough left over for transient storage for
intermediate results will be not only faster, but easier to 
write and to maintain.  I have used workspaces from 32K to 
several megabytes, and would not willingly go back to the 
small ones.  Of course, my home system has a 750K workspace,
so most of my recent code has been very easy - I just don't
seem to have too much data in my life :-) ... of course the 
data compression / storage reduction techniques learned in
those 32K workspaces still do a great job in maximizing the
capabilities of something more than twenty times larger.
 
Personally, I don't think (for a single user machine) that you have
to provide multiple "control streams" or simultaneous processing
of user level tasks, but some parallelism in the execution of
array operations - or a high speed vector processing scheme -
could provide a major boost in performance.  In many operations
on conformal arrays, the (logical) structure of the data is
not important during the actual computation step.  If you could
generalize (should be easy) for scalar/array operations, then
you have a large part of the problem in hand.  Many other 
operations could be decomposed into a number of vector, or
"quasi-vector" operations.  By quasi-vector, I mean that there
are a number of elements, regularly and repetitively spaced 
through the machine's memory, with (temporarily) irrelevant
elements interspersed. 
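
To make the quasi-vector idea concrete, here is a small sketch in
Python that treats memory as one flat list and describes each
operand by a start, a stride, and a count.  The names are
hypothetical, and using a stride of zero to stand in for a scalar
is my own assumption about how the scalar/array case might be
folded in, not a description of any existing machine.

```python
def strided(memory, start, stride, count):
    """Fetch a quasi-vector: count elements spaced stride apart."""
    return [memory[start + i * stride] for i in range(count)]


def add_strided(memory, a_start, a_stride, b_start, b_stride, count):
    """Elementwise add of two quasi-vectors living in the same memory.
    A stride of 0 rereads one cell, so a scalar acts like a vector."""
    return [memory[a_start + i * a_stride] + memory[b_start + i * b_stride]
            for i in range(count)]


if __name__ == "__main__":
    rows, cols = 3, 4
    # A 3 x 4 matrix stored row by row, followed by one scalar cell.
    memory = list(range(rows * cols)) + [100]
    scalar_at = rows * cols

    print(strided(memory, 1, cols, rows))       # column 1: [1, 5, 9]
    print(strided(memory, 2 * cols, 1, cols))   # row 2: [8, 9, 10, 11]
    # column 1 plus the scalar, with no copy of either operand
    print(add_strided(memory, 1, cols, scalar_at, 0, rows))  # [101, 105, 109]
```

The inner loop never looks at the shape of the array, only at the
three numbers per operand, which is what would make it a
candidate for a vector unit.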
 
In real world terms, another major performance boost is a true
native APL file system - one that stores the components in their
internal representation.  I did work with one system that only
stored what were essentially character representations of the 
data (two dimensional? memory is mercifully weak on the details)
and a lot of time was spent converting to and from "file format".
This is probably too peripheral for your interests, as it should
be primarily a "software file server" issue, but it does touch
on (perhaps) some details of the hardware/software interface.
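
The difference between the two storage styles can be sketched in
a few lines of Python.  This is only an illustration using the
standard array module, with function names of my own; it is not
the format of any actual APL file system.

```python
from array import array


def to_native(values):
    """Store a numeric vector as raw machine doubles, i.e. the bytes
    are the in-memory representation and need no conversion."""
    return array("d", values).tobytes()


def from_native(blob):
    """Reload raw doubles: a block copy, no parsing."""
    out = array("d")
    out.frombytes(blob)
    return out.tolist()


def to_text(values):
    """Store the same vector as characters ('file format')."""
    return " ".join(repr(v) for v in values).encode("ascii")


def from_text(blob):
    """Reload from characters: every element must be parsed again."""
    return [float(tok) for tok in blob.decode("ascii").split()]


if __name__ == "__main__":
    values = [1.5, -2.25, 3.0]
    print(from_native(to_native(values)) == values)
    print(from_text(to_text(values)) == values)
    print(len(to_native(values)), len(to_text(values)))
```

Both round trips return the same data; the native path simply
skips the formatting and parsing work on every read and write.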
 
The idea of making APL the operating system, and giving it 
complete and direct control over the system has considerable
appeal.  The benefits could be enormous.
 
 

Speaking of Analogic, are they still around?  Do they still have
any APL products?  And, for that matter, is there a good APL for
Intel (MS-DOS) machines that does not cost the better part of a 
thousand dollars?

	

Daniel Albano
Toronto, Ontario