Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!olivea!tardis!tymix!uunet!validgh!dgh
From: dgh@validgh.com (David G. Hough on validgh)
Newsgroups: comp.arch
Subject: rounded vs. chopped floating-point arithmetic
Message-ID: <402@validgh.com>
Date: 20 Jun 91 13:50:36 GMT
Organization: validgh, PO Box 20370, San Jose, CA 95160
Lines: 96

Nelson Beebe (beebe@math.utah.edu) recollected the following message
to a colleague:

 -------------------------------------------

  The following little program can be used to illustrate the effect that
  truncating arithmetic has on your larger program:
  
        real dt,t0,t1,t2,tend
        integer n
  
        n = 0
        dt = 0.018
        t0 = 4000.0
        tend = 5000.0
        t1 = t0
        t2 = t0
  
   10   n = n + 1
        t1 = t1 + dt
        t2 = t0 + float(n)*dt
        if (t2 .lt. tend) go to 10
        write (6,*) t1,t2,(t1 - t2)/t2
        end
  
  On the IBM 3090, this single precision version prints:
  
     4879.89844       5000.00781     -0.240218341E-01
  
  That is, the relative error is 2.4%.  On the Sun 4, it produces
  
      5003.70    5000.01    7.37889E-04
  
  The effect of truncating arithmetic on the running sum is large.
  
  The double precision version is:
  
        double precision dt,t0,t1,t2,tend
        integer n
  
        n = 0
        dt = 0.018D+00
        t0 = 4000.0D+00
        tend = 5000.0D+00
        t1 = t0
        t2 = t0
  
   10   n = n + 1
        t1 = t1 + dt
        t2 = t0 + dfloat(n)*dt
        if (t2 .lt. tend) go to 10
        write (6,*) t1,t2,(t1 - t2)/t2
        end
  
  The IBM 3090 result is
  
  5000.00799995563648       5000.00799999999981     -0.887265231637227285E-11
  
  The Sun 4 result is
  
  5000.0080000016    5000.0080000000    3.2341579848518D-13

 -------------------------------------------

Note that satisfactory results are obtained if you use enough precision
or if you round rather than chop.  Also note that this is not the program
that failed, but rather a drastic simplification of the user's actual
application to reveal the essential problem.  It's a simple example where
the superior statistics of rounding rather than chopping imply a broader
domain of applicability for a particular program.
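
The mechanism can be checked without either machine.  What follows is a
minimal Python sketch, not the original Fortran.  It models every
single-precision operation as "compute exactly, then fit to the format",
which is an idealization of real hardware.  The names fit and run and
their parameters are made up for this illustration.  With 24 binary digits
and round-to-nearest it models IEEE single precision as on the Sun 4; with
6 hexadecimal digits and chopping it is only a rough model of IBM 370-style
short floating point as on the 3090.

```python
import math

def fit(x, digits, radix_bits, chop):
    """Fit x into `digits` significant digits of radix 2**radix_bits.

    chop=True truncates toward zero; chop=False rounds to nearest,
    ties to even.  Assumes x is an exact result held in a double.
    """
    if x == 0.0:
        return 0.0
    _, e2 = math.frexp(x)               # 2**(e2-1) <= |x| < 2**e2
    e = -(-e2 // radix_bits)            # ceiling: radix**(e-1) <= |x| < radix**e
    shift = radix_bits * (e - digits)   # log2 of one unit in the last place
    scaled = math.ldexp(x, -shift)      # exact power-of-two scaling
    n = math.trunc(scaled) if chop else round(scaled)
    return math.ldexp(n, shift)

def run(digits, radix_bits, chop):
    """Replay the single-precision loop from the quoted program."""
    f = lambda x: fit(x, digits, radix_bits, chop)
    dt = f(0.018)                       # decimal constant, converted via double
    t0, tend = 4000.0, 5000.0
    t1, n = t0, 0
    while True:
        n += 1
        t1 = f(t1 + dt)                 # running sum: one rounding per step
        t2 = f(t0 + f(n * dt))          # recomputed from scratch: two roundings
        if t2 >= tend:
            break
    # The relative error is formed in double, for display only.
    return t1, t2, (t1 - t2) / t2

if __name__ == "__main__":
    print("binary, 24 digits, rounded:", run(24, 1, chop=False))
    print("hex,     6 digits, chopped:", run(6, 4, chop=True))
```

In the hexadecimal format the spacing between neighbouring numbers above
4096 is 1/256, so 0.018 is about 4.6 units in the last place, and chopping
adds only 4 of them (0.015625) at each step.  In IEEE single precision the
spacing there is 1/2048, 0.018 is about 36.9 units, and rounding adds 37.
The per-step error has the same sign every time in both cases; it is simply
far larger, relative to dt, in the chopped hexadecimal format.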

Correct rounding and chopping, and several other good paradigms, can be
characterized by the property

	The rounded computed result is chosen from the two machine-representable
	numbers nearest the unrounded infinitely-precise exact result,
	according to a rule that depends only on the infinitely-precise
	exact result, and not on the operands or operation (or phase of the moon).

Most "fast" sort-of-rounding or sort-of-chopping schemes invented by 
hardware designers eventually frustrate error analysts because they 
can't be so characterized.
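
The property above can be written down directly.  This is a minimal Python
sketch using exact rational arithmetic, restricted to positive values and
to an unbounded exponent range.  The names neighbours, chop, nearest_even
and round_exact are invented for the illustration, and the 24-digit binary
format is just an example.  The point is the shape of the interface: a
rule sees only the exact result and its two neighbours, never the operands
or the operation.

```python
from fractions import Fraction
import math

def neighbours(x, bits=24):
    """Return (lo, hi, ulp): the representable numbers bracketing exact x > 0."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if x >= Fraction(2) ** e:
        e += 1                           # now 2**(e-1) <= x < 2**e
    ulp = Fraction(2) ** (e - bits)      # spacing of representable numbers near x
    lo = math.floor(x / ulp) * ulp
    hi = lo if lo == x else lo + ulp     # lo == hi when x is representable
    return lo, hi, ulp

def chop(x, lo, hi, ulp):
    """Always take the neighbour nearer zero."""
    return lo

def nearest_even(x, lo, hi, ulp):
    """Take the nearer neighbour; on a tie, the one with an even last digit."""
    if x - lo != hi - x:
        return lo if x - lo < hi - x else hi
    return lo if (lo / ulp) % 2 == 0 else hi

def round_exact(x, rule, bits=24):
    """Apply a rounding rule that depends on the exact result alone."""
    return rule(x, *neighbours(x, bits))

# Two different operations with the same exact result must round alike.
h = Fraction(1, 2 ** 12)
via_product = (1 + h) * (1 + h)          # 1 + 2**-11 + 2**-24, exactly
via_sum = (1 + 2 * h) + h * h            # the same value, reached by addition
assert via_product == via_sum
assert round_exact(via_product, chop) == round_exact(via_sum, chop)
assert round_exact(via_product, nearest_even) == round_exact(via_sum, nearest_even)
```

A scheme whose outcome depends on how the operands were aligned, or on
which operation produced the result, cannot be expressed as a rule with
this signature.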

As for the first IBM RS/6000 implementation, I have heard that the original
floating-point unit was designed to implement IBM 370 arithmetic, and was
changed to IEEE 754 format relatively late in the game.  If true, then it
would not be surprising that some aspects of 754 were problematic to add in.
The interesting question would then be which aspects of IEEE arithmetic
will be really problematic for a high-performance RS/6000 implementation
designed from scratch to support 754.
-- 

David Hough

dgh@validgh.com		uunet!validgh!dgh	na.hough@na-net.ornl.gov