Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10 5/3/83; site ssc-vax.UUCP Path: utzoo!linus!decvax!harpo!gummo!whuxlb!floyd!vax135!cornell!uw-beaver!ssc-vax!sts From: sts@ssc-vax.UUCP (Stanley T Shebs) Newsgroups: net.ai Subject: Re: Matrix Multiplication on the FFP Machine Message-ID: <406@ssc-vax.UUCP> Date: Thu, 11-Aug-83 16:23:19 EDT Article-I.D.: ssc-vax.406 Posted: Thu Aug 11 16:23:19 1983 Date-Received: Fri, 12-Aug-83 17:38:02 EDT References: <5687@unc.UUCP> Organization: Boeing Aerospace, Seattle Lines: 30 I must admit to being a little sloppy when giving the maximum speed of a matrix multiplication on an FFP machine (haven't worked on this stuff for a year, and my memory is slipping). I still stand by the original statement, however. The *maximum* possible speed for the multiplication of two nxn matrices is O(log n). What I should have done is state that the machine architecture is completely unspecified. I am not convinced that the Mago tree machine is the ultimate in FFP designs, although it is very interesting. The achievement of O(log n) requires several things. Let me enumerate. First, assume that the matrix elements are already distributed to their processors. Second, assume that a single processor can quickly distribute a value to arbitrarily many processors (easy: put it on the bus (buss? :-} ) and let the processors all go through a read cycle simultaneously). Third, assume that the processors can communicate in such a way that addition of n numbers can be performed in log n time (by adding pairs, then pairs of pairs, etc). Then the distribution of values takes constant time, the multiplications are all done simultaneously and so take constant time, leaving only the summation to slow things down. I know this is fast and loose; its main failing is that it assumes the availability of an extraordinarily high number of communication paths (the exact number is left as an exercise for the reader). stan the leprechaun hacker ssc-vax!sts (soon utah-cs) ps For those not familiar with FP, read J. Backus' Turing Lecture in CACM (Aug 78, I believe) - it is very readable, also he gives a one-liner for matrix multiplication in FP, which I used as a basis for the timing hackery above