From: dodson@convex.COM (Dave Dodson)
Newsgroups: comp.sys.super,comp.arch,comp.parallel
Subject: Re: Massively Parallel LINPACK on the Intel Touchstone Delta machine
Message-ID: <1991Jun6.174129.25202@hubcap.clemson.edu>
Date: 6 Jun 91 16:30:31 GMT
References: <1991Jun3.233741.8570@elroy.jpl.nasa.gov> <13301@pt.cs.cmu.edu> <ELIAS.91Jun6090922@wonton.TC.Cornell.EDU>
Sender: usenet@convex.com (news access account)
Reply-To: dodson@convex.COM (Dave Dodson)
Organization: CONVEX Computer Corporation, Richardson, Tx., USA
Lines: 37
Approved: parallel@hubcap.clemson.edu
Apparently-To: hypercube@hubcap.clemson.edu
Nntp-Posting-Host: mozart.convex.com

In article <ELIAS.91Jun6090922@wonton.TC.Cornell.EDU> elias@wonton.TC.Cornell.EDU (Doug Elias) writes:
>i'd appreciate some rationale on
>   "FLOPS is defined as  (2/3 N^3 + 2 N^2) / elapsed-time",
>where, i assume, N == number of processors used in the computation.

No.  N is the order of the system of equations.  The expression in
parentheses approximates the operation count for Gaussian elimination
using ordinary vector and matrix operations: about 2/3 N^3 for the
factorization and 2 N^2 for the triangular solves.
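
As a minimal sketch of how the quoted definition turns into a reported
rate, the following computes the nominal operation count and divides by
the elapsed time.  The function names are hypothetical, chosen only for
illustration; they do not come from any benchmark code.

```python
def linpack_flop_count(n):
    """Nominal operation count credited for solving an order-n system."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def linpack_flops_rate(n, elapsed_seconds):
    """Reported rate: credited operations divided by elapsed time."""
    return linpack_flop_count(n) / elapsed_seconds


if __name__ == "__main__":
    n = 1000                 # order of the system, NOT the processor count
    elapsed = 2.0            # made-up timing, for illustration only
    print(linpack_flop_count(n))
    print(linpack_flops_rate(n, elapsed))
```

Note that the count depends only on N, so the same formula is applied
no matter how many processors did the work.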

What is interesting about this is that there are algorithms based on
"fast" matrix multiplication, where the product of two K by K matrices
can be formed with asymptotically fewer than the O(K^3) floating point
operations of the conventional method.  If you use one of these fast
algorithms, you may do significantly fewer than
(2/3 N^3 + 2 N^2) floating point operations, but you get credit for
(2/3 N^3 + 2 N^2) operations anyway.  The numerical stability of
these fast methods is questionable, especially when the equations
and unknowns are poorly scaled.  Therefore, a benchmark report on a
solver should state if a fast matrix multiplication algorithm is used
so the reader can draw his own conclusions regarding the solver's
applicability to his problem.
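
To give a feel for the gap between operations performed and operations
credited, here is a rough sketch that counts the arithmetic in a
Strassen-style recursion.  Strassen's algorithm is my assumption as one
well-known example of a "fast" method; no particular algorithm is named
above.  The count assumes K is a power of two, 7 half-size products and
18 half-size matrix additions per level, and a hypothetical `cutoff`
below which the conventional method is used.

```python
def conventional_flops(k):
    """Multiplies plus adds for the ordinary K by K matrix product."""
    return 2 * k**3 - k**2


def strassen_flops(k, cutoff=1):
    """Operation count for a Strassen-style recursion (k a power of two)."""
    if k <= cutoff:
        return conventional_flops(k)
    if k % 2 != 0:
        raise ValueError("k must be a power of two in this sketch")
    half = k // 2
    # 7 recursive products plus 18 additions of half-size matrices
    return 7 * strassen_flops(half, cutoff) + 18 * half * half


if __name__ == "__main__":
    for k in (2, 64, 1024):
        print(k, conventional_flops(k), strassen_flops(k))
```

For small K the recursion actually costs more than the conventional
product, which is why practical codes stop recursing at some cutoff;
the savings only show up for large matrices.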

Another factor that must be considered is the type of pivoting used.
As pivoting can involve much data motion, omitting it is desirable
from the performance point of view.  However, the resulting code may
divide by zero or, even worse, produce severely inaccurate answers
with no warning.  Except for matrices that are known to have special
properties such as positive definiteness or strict diagonal dominance,
at least partial pivoting is required to ensure numerical stability.
I haven't read the Dongarra report, so I don't know whether it
requires pivoting, but if it doesn't, then a complete report on a
solver should describe the type of pivoting used.  I don't recall
seeing that information in the Touchstone Delta report.
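
The two failure modes mentioned above are easy to reproduce on a 2 by 2
system.  What follows is a small illustrative sketch of dense Gaussian
elimination with an optional partial pivoting step; the function name
`solve` and its `pivot` flag are my own, and this is not the code of
any benchmarked solver.

```python
def solve(a, b, pivot=True):
    """Solve a x = b by Gaussian elimination, optionally with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]   # work on copies
    b = b[:]
    for k in range(n - 1):
        if pivot:
            # pick the row with the largest entry in column k
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            if p != k:
                a[k], a[p] = a[p], a[k]
                b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]          # fails if the pivot is zero
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):         # back substitution
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


if __name__ == "__main__":
    a = [[1e-20, 1.0], [1.0, 1.0]]
    b = [1.0, 2.0]                         # true solution is close to [1, 1]
    print(solve(a, b, pivot=False))        # tiny pivot: first unknown is lost
    print(solve(a, b, pivot=True))
```

With the tiny pivot and no row exchange, the first unknown comes back as
0 instead of roughly 1, and nothing signals the error.  Replacing 1e-20
with an exact zero makes the unpivoted version raise ZeroDivisionError.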

----------------------------------------------------------------------

Dave Dodson		                             dodson@convex.COM
Convex Computer Corporation      Richardson, Texas      (214) 497-4234

-- 
=========================== MODERATOR ==============================
Steve Stevenson                            {steve,fpst}@hubcap.clemson.edu
Department of Computer Science,            comp.parallel
Clemson University, Clemson, SC 29634-1906 (803)656-5880.mabell