Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site ssc-vax.UUCP
Path: utzoo!linus!decvax!harpo!gummo!whuxlb!floyd!vax135!cornell!uw-beaver!ssc-vax!sts
From: sts@ssc-vax.UUCP (Stanley T Shebs)
Newsgroups: net.ai
Subject: Re: Matrix Multiplication on the FFP Machine
Message-ID: <406@ssc-vax.UUCP>
Date: Thu, 11-Aug-83 16:23:19 EDT
Article-I.D.: ssc-vax.406
Posted: Thu Aug 11 16:23:19 1983
Date-Received: Fri, 12-Aug-83 17:38:02 EDT
References: <5687@unc.UUCP>
Organization: Boeing Aerospace, Seattle
Lines: 30

I must admit to being a little sloppy when giving the maximum speed
of a matrix multiplication on an FFP machine (haven't worked on this
stuff for a year, and my memory is slipping).  I still stand by the
original statement, however.  The *maximum* possible speed for the 
multiplication of two nxn matrices is O(log n).  What I should have
done is state that the machine architecture is completely unspecified.
I am not convinced that the Mago tree machine is the ultimate in
FFP designs, although it is very interesting.  The achievement of
O(log n) requires several things.  Let me enumerate.  First, assume
that the matrix elements are already distributed to their processors.
Second, assume that a single processor can quickly distribute a value to
arbitrarily many processors (easy: put it on the bus (buss? :-} )
and let the processors all go through a read cycle simultaneously).
Third, assume that the processors can communicate in such a way
that addition of n numbers can be performed in log n time (by adding
pairs, then pairs of pairs, etc).  Then the distribution of values
takes constant time, the multiplications are all done simultaneously
and so take constant time, leaving only the summation to slow things
down.  I know this is fast and loose; its main failing is that it
assumes the availability of an extraordinarily high number of
communication paths (the exact number is left as an exercise for the
reader).

					stan the leprechaun hacker
					ssc-vax!sts (soon utah-cs)

ps For those not familiar with FP, read J. Backus' Turing Lecture
in CACM (Aug 78, I believe) - it is very readable, also he gives
a one-liner for matrix multiplication in FP, which I used as a basis
for the timing hackery above