Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 exptools; site ihuxb.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!mhuxv!mhuxh!mhuxj!mhuxn!ihnp4!ihuxb!wfmans
From: wfmans@ihuxb.UUCP (w. mansfield)
Newsgroups: net.arch
Subject: Re: Floating Point Rounding
Message-ID: <1117@ihuxb.UUCP>
Date: Wed, 24-Jul-85 10:06:59 EDT
Article-I.D.: ihuxb.1117
Posted: Wed Jul 24 10:06:59 1985
Date-Received: Fri, 26-Jul-85 07:38:22 EDT
References: <36900010@ima.UUCP> <1357@peora.UUCP>
Organization: AT&T Bell Laboratories
Lines: 57

> ... (discussion of plain ordinary rounding)
> So far, this is plain, ordinary rounding.  But now comes the problem.  If
> the most-significant guard digit is 8, and all other guard digits are zero,
> you have a dilemma.  If you always round down, the result will come out
> "too small" (we're speaking here in terms of absolute values); if you
> always round up, it will come out "too large".  So... if all other guard
> digits are 0, the least significant bit of the low-order digit of the
> mantissa is FORCED to 1.  The theory here is that there is an equal
> probability that this low-order bit will previously have been a 1 or a zero
> -- something that is much more probable, I believe, than the incorrect
> high-order bit assumption mentioned earlier.  As a result, this R* rounding
> will give a "More Accurate" rounding than the IBM rounding, according
> to this reasoning.
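
To make the tie case concrete, here is a minimal sketch of the rule as the
quoted text describes it, next to a round-to-even rule for comparison.  It is
my own illustration, not code from any real machine: the function names are
hypothetical, values are treated as non-negative magnitudes, and the mantissa
plus its guard bits are assumed to be packed into one integer.

```python
def split(value, guard_bits):
    """Split a non-negative integer into (mantissa, guard, half).

    Assumption: the low `guard_bits` bits of `value` are the guard bits.
    """
    mantissa = value >> guard_bits
    guard = value & ((1 << guard_bits) - 1)
    half = 1 << (guard_bits - 1)   # e.g. hex digit 8 when guard_bits == 4
    return mantissa, guard, half


def round_tie_forced_one(value, guard_bits):
    """Round to nearest; on an exact tie, force the low mantissa bit to 1."""
    mantissa, guard, half = split(value, guard_bits)
    if guard > half:
        return mantissa + 1
    if guard < half:
        return mantissa
    return mantissa | 1            # tie: low-order bit forced to 1


def round_tie_to_even(value, guard_bits):
    """Round to nearest; on an exact tie, pick the mantissa with low bit 0."""
    mantissa, guard, half = split(value, guard_bits)
    if guard > half:
        return mantissa + 1
    if guard < half:
        return mantissa
    return mantissa + (mantissa & 1)   # tie: bump odd mantissas up to even


# One hex guard digit (4 bits).  0x28 and 0x38 are exact ties (guard digit 8).
for v in (0x27, 0x28, 0x29, 0x38):
    print(hex(v), round_tie_forced_one(v, 4), round_tie_to_even(v, 4))
```

The two rules differ only on exact ties: 0x28 goes to 3 under the forced-one
rule and to 2 under round-to-even, while 0x38 goes to 3 and 4 respectively.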

I'd like to point you folks toward the IEEE 754 floating point standard and
its supporting documentation for discussions of rounding.  The IEEE 754
committee spent years (seemingly) discussing rounding, and determined that
four (4) rounding modes were required to support numerical algorithms in
computers:  Round to nearest, round to plus infinity, round to minus
infinity and round to zero.  Round to nearest is the mode discussed above.
In the case referred to above (in IEEE language, where the two nearest
representable values are equally near the infinitely precise value), the one
with the least significant bit zero will be returned.  In some previous
drafts the IEEE committee wanted to alternate the value returned in this
case; that is, alternately returning the lesser and greater model number.
Although this was mathematically pleasing, the hardware-oriented members of
the committee suffered apoplexy, and they settled on returning the value
whose least significant bit is zero.
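
For anyone who wants to see the four modes side by side, here is a small
sketch.  It uses Python's decimal module as a stand-in, so it rounds decimal
values to integers rather than binary mantissas; the mapping of IEEE mode
names onto the decimal module's constants, and the helper names, are my own
assumptions for illustration.  The last function shows the tie rule on real
binary doubles, relying on the fact that Python's int-to-float conversion
rounds to nearest with ties going to the even mantissa.

```python
from decimal import (Decimal, ROUND_HALF_EVEN, ROUND_CEILING,
                     ROUND_FLOOR, ROUND_DOWN)

# Assumed correspondence between the four modes and decimal's constants.
MODES = {
    "nearest": ROUND_HALF_EVEN,        # ties go to the even neighbour
    "plus_infinity": ROUND_CEILING,
    "minus_infinity": ROUND_FLOOR,
    "zero": ROUND_DOWN,
}


def round_to_integer(text, mode):
    """Round the decimal string `text` to an integer under the named mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))


def mode_table(samples=("2.5", "3.5", "-2.5")):
    """Return {mode: [rounded samples]} for all four modes."""
    return {mode: [round_to_integer(s, mode) for s in samples]
            for mode in MODES}


def binary_tie_examples():
    """Exact ties in double precision: just above 2**53 only even integers fit."""
    low_tie = float(2**53 + 1)    # halfway between 2**53 and 2**53 + 2
    high_tie = float(2**53 + 3)   # halfway between 2**53 + 2 and 2**53 + 4
    return int(low_tie), int(high_tie)


for mode, values in mode_table().items():
    print(mode, values)
print(binary_tie_examples())
```

For the samples 2.5, 3.5 and -2.5 this gives 2, 4, -2 for nearest; 3, 4, -2
for plus infinity; 2, 3, -3 for minus infinity; and 2, 3, -2 for zero.  The
two binary ties come out as 2**53 and 2**53 + 4, the neighbours whose
mantissa has a zero low-order bit.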

I always thought that the reason for truncation in IBM was because the
FORTRAN language was defined that way.  It is clearly difficult to tell
cause from effect here.
> 
> However, very sadly, in the real world, "better" is not always good enough.
> Since IBM does it another way, the less-accurate results have come to be so
> ingrained in the scientific community that more accurate results often
> cause an uproar: "you didn't get the results our IBM got, so your machine
> is broken!" they exclaim.  As a result, R* rounding is not that popular,
> even though there is a large body of theory behind it showing that it is
> a "better" rounding scheme than the IBM approach.
> 
A big part of the problem is that the real numerical analysts designing
programs for computers are very aware of the peculiarities of the hardware
that their programs run on, and they have very carefully designed their
software to extract as much precision as possible from the hardware, and
in doing so have made their programs hardware specific.
The purpose of the IEEE standard is to provide a machine independent system
of arithmetic, where the same algorithm will give the same result regardless
of the underlying hardware implementation.

Another part of the problem is that the folks who write specifications for
computer languages (Ada excluded) don't specify the type of arithmetic
to be performed.  Basically they wimp out and say "the answer you get
depends on the hardware".  This is especially true where floating point
is concerned.

(The preceding is my opinion -- I do not represent my employer or IEEE in
presenting these views)