Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!wuarchive!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.arch
Subject: Re: IEEE floating point
Message-ID:
Date: 30 May 91 14:51:54 GMT
References: <9105250030.AA08036@ucbvax.Berkeley.EDU> <1991May25.222551.16365@zoo.toronto.edu>
Sender: usenet@ee.udel.edu
Organization: College of Marine Studies, U. Del.
Lines: 83
Nntp-Posting-Host: perelandra.cms.udel.edu
In-reply-to: mccalpin@perelandra.cms.udel.edu's message of 29 May 91 20:46:41 GMT

> On 29 May 91 20:46:41 GMT, mccalpin@perelandra.cms.udel.edu I said:

Me> RMS ERRORS IN SOLUTION OF LINPACK ORDER 1000 SYSTEM
Me> machine                  precision    RMS error
Me> ---------------------------------------------------------
Me> IBM RS/6000              64-bit       1.3e-10  <-- WRONG!
Me> IBM RS/6000 (-qnomaf)    64-bit       2.2e-12  <-- WRONG!
Me> IEEE (Sun-3)             64-bit       2.3e-13
Me> ---------------------------------------------------------

The mystery is solved, thanks to James Shearer of IBM, who found a bug
in one of the BLAS routines I was using. The IBM RS/6000 machines now
reproduce the IEEE results and show an insignificant difference in
accuracy between the results with the multiply-accumulate instruction
and without it.

>Date: Wed, 29 May 91 20:30:40 EDT
>From: jbs@watson.ibm.com
>To: mccalpin
>Subject: linpack 1000
>
>         The code you sent me appears to contain a bug.  In the isamax
>routine there is a statement:
>      dmax=abs(dx(ix))
>I believe ix should be i.  This causes the find pivot step to possibly
>find an incorrect pivot.  This would explain an increased error in the
>result.  When I change ix to i, I now get a rms error using the nomaf
>option of 2.27*10**-13 (2.24*10**-13 using maf) which agrees with the
>other IEEE machines.  The results for the 3090 (IBM hex) also change
>(7.08*10**-13).  It took me longer to find this than it should have
>since the VS Fortran compiler was warning me about possibly
>uninitialized variables (ddot appears to have the same problem although
>it is not used).
>                          James B. Shearer

The table of results is now:

---------------------------------------------------------
Errors in solution of 1000x1000 system of equations
from the LINPACK benchmark suite

machine              precision    RMS error
---------------------------------------------------------
ETA-10               32-bit       2.2e-01
IBM 3081             32-bit       2.4e-03
VAX 8700             32-bit       3.9e-04
IEEE                 32-bit       2.8e-04
ETA-10               64-bit       1.3e-08
Cray X/MP            64-bit       2.5e-11
IBM 3090             64-bit       7.1e-13
IEEE                 64-bit       2.3e-13
VAX "D"-format       64-bit       7.2e-14
ETA-10               128-bit      1.6e-22
Cray X/MP            128-bit      4.2e-26
---------------------------------------------------------
John D. McCalpin - mccalpin@perelandra.cms.udel.edu
---------------------------------------------------------

Machines which have been tested with IEEE formats include: Sun-3,
Sun-4, IRIS 4D, and IBM RS/6000. Curiously, the IRIS 3000 series,
which used IEEE formats, but did not use IEEE-compliant rounding got
slightly better results (1 bit or so).

Note that the error which Shearer found in my code was only for my
port of the DOUBLEPRECISION BLAS, so the 32-bit results and the 64-bit
results on the machines for which REAL is 64-bits are probably good.

In Summary:
-----------
The original purpose of the post was to show that the Cyber 205/ETA-10
32-bit format was *much* worse in accuracy than the IBM HEX format,
which these results show clearly (for this one test case).

The IBM HEX results are only 2-3 bits worse than IEEE, which is seldom
disastrous. If this 2-3 bit difference does cause your application
serious trouble, then you should be running at the next higher
precision on all the platforms....
--
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
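
Appendix (illustration only): the pivot-search bug that Shearer describes can be sketched in a few lines. This is a Python sketch, not the Fortran BLAS source that was actually run. The function names iamax and iamax_stale_index are hypothetical, and the stale index value ix=0 is an assumption made so the example is repeatable; in the real code ix was uninitialized, so its value was unpredictable.

```python
def iamax(x):
    """Return the 0-based index of the first element of largest magnitude.

    Returns -1 for an empty sequence.
    """
    if not x:
        return -1
    best = 0
    dmax = abs(x[0])
    for i in range(1, len(x)):
        if abs(x[i]) > dmax:
            best = i
            dmax = abs(x[i])  # correct: refresh the running max from x[i]
    return best


def iamax_stale_index(x, ix=0):
    """Same search, but the running max is refreshed from x[ix].

    ix stands in for the uninitialized Fortran variable (assumed value 0
    here), so dmax no longer tracks the element that was just selected.
    """
    if not x:
        return -1
    best = 0
    dmax = abs(x[0])
    for i in range(1, len(x)):
        if abs(x[i]) > dmax:
            best = i
            dmax = abs(x[ix])  # bug: should be x[i]
    return best


column = [1.0, 5.0, -3.0, 2.0]
print(iamax(column))              # index of 5.0, the largest magnitude
print(iamax_stale_index(column))  # a later, smaller element is chosen
```

With the stale index the running maximum stays at abs(x[0]), so every later element larger than that overwrites the choice and the search can return a pivot that is not the largest in the column. Gaussian elimination still completes with such a pivot, but with larger multipliers and more rounding error, which is consistent with the inflated RMS errors reported before the fix.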