Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!mcsun!hp4nl!charon!dik
From: dik@cwi.nl (Dik T. Winter)
Newsgroups: comp.arch
Subject: Re: RS6000 Multiply/Accumulate instruction
Keywords: RS6000, floating-point multiply, fp add
Message-ID: <8907@boring.cwi.nl>
Date: 21 Mar 90 00:31:23 GMT
References: <3060@wtkatz.oakhill.UUCP> <5827@udccvax1.acs.udel.EDU> <5854@udccvax1.acs.udel.EDU> <8407@pt.cs.cmu.edu> <5733@brazos.Rice.edu> <5858@udccvax1.acs.udel.EDU> <8888@boring.cwi.nl> <5863@udccvax1.acs.udel.EDU>
Sender: news@cwi.nl (The Daily Dross)
Distribution: comp
Organization: CWI, Amsterdam
Lines: 47

About:
    Errors in solution of the 1000x1000 system of equations
machine                 precision       RMS error
---------------------------------------------------------
IBM RS/6000             64-bit          1.2e-12  <-- new ibm
IEEE (Sun-3)            64-bit          2.3e-13  <-- ieee
 
I asked how this was generated; John McCalpin answers:
 > The RMS error that I have been posting is based on the fact that the
 > matrix is chosen in such as way that the solution consists of 1000
 > identical elements equal to 1.0d0.  It does this by generating a 
 > pseudo-random matrix and then re-scaling by the row sums (I think this
 > is how it was done).
(I assume this is from a uniform distribution on [0.0,1.0], but that does
not matter very much.  I also assume the right-hand side is random.)
 >                       There are therefore two possible sources of error:
 > --> The first is that the matrix may not have been scaled correctly because
 > of roundoff error in the calculation of the row sums.
Right, but if we use backward error analysis that does not matter.
 > --> The second is that the solution contains propagated roundoff error from
 > the LU-decomposition and back-substitution steps.
Agreed.
 > 
 > I have consistently ignored the first source of error, and defined the
 > RMS error to be:
Follows the definition.

The problem I see is that to compare arithmetic properties of processors
you must use *identical* input data on which to perform the operations.
When using random numbers from the system-supplied random number generator
you cannot be sure of this (and in general it is not true).  When solving
a set of linear equations, different input data can change the error a lot.
I just did some tests with a 100*100 system on our Alliant FX/4 (nearly
IEEE arithmetic).  I used LINPACK's DGEFA (for decomposition) and DGESL
(for solving).  Even when I kept the same system and only reordered the
right-hand side, the errors ranged from 4.3e-14 to 1.2e-13.  The next
system I tried (also random) gave errors ranging from 8.3e-14 to 1.4e-13.
In all cases these errors are in the range to be expected.  (These numbers
are smaller than those given above because the error scales with the order
of the matrix.)
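
For anyone who wants to repeat this kind of experiment, here is a rough
sketch in Python.  It is not the LINPACK code I ran: it uses a plain
Gaussian elimination with partial pivoting in place of DGEFA/DGESL, and
the function names (make_system, solve, rms_error, error_spread) are made
up for illustration.  I assume the matrix is uniform random with each row
divided by its row sum and a right-hand side of all ones, so that the
intended solution is all ones, and I assume the RMS error is the RMS
deviation of the computed solution from 1.0.  The point is only that the
seed fixes the input data, so two machines can be fed identical systems,
and that different seeds give visibly different errors.

```python
import math
import random


def make_system(n, seed):
    """Build an n*n test system whose intended solution is all ones.

    Assumption: entries are uniform on [0, 1) and each row is divided by
    its row sum, so with b = (1, ..., 1) the solution is (1, ..., 1) up
    to the roundoff made in the scaling itself.
    """
    rng = random.Random(seed)  # fixed seed gives reproducible input data
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    for row in a:
        s = sum(row)
        for j in range(n):
            row[j] /= s
    b = [1.0] * n
    return a, b


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies, leave the input untouched
    b = b[:]
    for k in range(n):
        # Pick the row with the largest pivot candidate in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back-substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def rms_error(x):
    """RMS deviation of x from the intended solution of all ones."""
    return math.sqrt(sum((xi - 1.0) ** 2 for xi in x) / len(x))


def error_spread(n, seeds):
    """RMS error for one random n*n system per seed."""
    return [rms_error(solve(*make_system(n, s))) for s in seeds]


if __name__ == "__main__":
    for seed, err in zip(range(5), error_spread(100, range(5))):
        print(seed, err)
```

Running it prints one RMS error per seed.  The spread between seeds on a
single machine is the noise that a difference between two machines has to
exceed before it means anything.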

So my conclusion still is that the difference in errors between the IBM and
Sun/IEEE implementations is within the noise.  There may indeed be some
flaw in the IBM implementation, but showing that requires more convincing
figures.
-- 
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl