Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 exptools; site ihuxb.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!mhuxv!mhuxh!mhuxj!mhuxn!ihnp4!ihuxb!wfmans
From: wfmans@ihuxb.UUCP (w. mansfield)
Newsgroups: net.arch
Subject: Re: Floating Point Rounding
Message-ID: <1117@ihuxb.UUCP>
Date: Wed, 24-Jul-85 10:06:59 EDT
Article-I.D.: ihuxb.1117
Posted: Wed Jul 24 10:06:59 1985
Date-Received: Fri, 26-Jul-85 07:38:22 EDT
References: <36900010@ima.UUCP> <1357@peora.UUCP>
Organization: AT&T Bell Laboratories
Lines: 57

> ... (discussion of plain ordinary rounding)
> So far, this is plain, ordinary rounding. But now comes the problem. If
> the most-significant guard digit is 8, and all other guard digits are zero,
> you have a dilemma. If you always round down, the result will come out
> "too small" (we're speaking here in terms of absolute values); if you
> always round up, it will come out "too large". So... if all other guard
> digits are 0, the least significant bit of the low-order digit of the
> mantissa is FORCED to 1. The theory here is that there is an equal
> probability that this low-order bit will previously have been a 1 or a zero
> -- something that is much more probable, I believe, than the incorrect
> high-order bit assumption mentioned earlier. As a result, this R* rounding
> will give a "More Accurate" rounding than the IBM rounding, according
> to this reasoning.

I'd like to point you folks toward the IEEE 754 floating point standard
and its supporting documentation for discussions of rounding. The IEEE 754
committee spent years (seemingly) discussing rounding, and determined that
four rounding modes were required to support numerical algorithms in
computers: round to nearest, round to plus infinity, round to minus
infinity, and round to zero. Round to nearest is the mode discussed above.

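To make the four modes concrete, here is a rough sketch of dropping guard
bits from a sign-magnitude significand under each of them. The function
name, the mode strings, and the integer representation are my own
illustrative choices, not anything taken from the standard's text. In the
halfway case, the "nearest" branch keeps the neighbour whose low-order bit
is zero, which is the rule IEEE 754 specifies for round to nearest.

```python
def drop_guard_bits(sig, guard_bits, mode, negative=False):
    """Shorten a sign-magnitude significand by `guard_bits` low-order bits.

    `sig` is the magnitude as a non-negative int; `negative` carries the sign.
    Returns the rounded magnitude. A carry out of the top kept bit is not
    renormalised here; a real implementation would also adjust the exponent.
    (All names and mode strings are hypothetical, chosen for illustration.)
    """
    if guard_bits < 1:
        raise ValueError("need at least one guard bit")
    kept = sig >> guard_bits                 # bits that survive
    rem = sig & ((1 << guard_bits) - 1)      # guard bits being dropped
    half = 1 << (guard_bits - 1)             # guard pattern 100...0, the tie
    if rem == 0:
        return kept                          # exact, nothing to round
    if mode == "nearest":
        # Above half: go up. Exactly half: go to the neighbour whose
        # low-order bit is zero, i.e. bump only when `kept` is odd.
        if rem > half or (rem == half and kept & 1):
            kept += 1
    elif mode == "plus_inf":
        if not negative:                     # larger magnitude is toward +inf
            kept += 1
    elif mode == "minus_inf":
        if negative:                         # larger magnitude is toward -inf
            kept += 1
    elif mode != "zero":                     # "zero" simply truncates
        raise ValueError("unknown mode: " + mode)
    return kept


# One hex guard digit equal to 8 is the halfway case from the quoted text.
for sig in (0xA8, 0xB8):
    for mode in ("nearest", "plus_inf", "minus_inf", "zero"):
        print(hex(sig), mode, hex(drop_guard_bits(sig, 4, mode)))
```

With an even low-order kept digit (0xA8), round to nearest leaves it at
0xA; with an odd one (0xB8), it goes up to 0xC. Ties are thus split between
rounding up and rounding down instead of always going one way.
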
In the case referred to above (in IEEE language, where the two nearest
representable values are equally near the infinitely precise value), the
one with the least significant bit zero will be returned. In some previous
drafts the IEEE committee wanted to alternate the value returned in this
case; that is, alternately returning the lesser and the greater model
number. Although this was mathematically pleasing, the hardware-oriented
members of the committee suffered apoplexy, and the committee settled on
returning the value whose least significant bit is zero.

I always thought that the reason for truncation on IBM machines was that
the FORTRAN language was defined that way. It is clearly difficult to
separate cause and effect here.

>
> However, very sadly, in the real world, "better" is not always good enough.
> Since IBM does it another way, the less-accurate results have come to be so
> ingrained in the scientific community that more accurate results often
> cause an uproar: "you didn't get the results our IBM got, so your machine
> is broken!" they exclaim. As a result, R* rounding is not that popular,
> eventhough there is a large body of theory behind it showing that it is
> a "better" rounding scheme than the IBM approach.
>

A big part of the problem is that the real numerical analysts designing
programs for computers are very aware of the peculiarities of the hardware
that their programs run on. They have very carefully designed their
software to extract as much precision as possible from that hardware, and
in doing so have made their programs hardware specific. The purpose of the
IEEE standard is to provide a machine-independent system of arithmetic,
where the same algorithm will give the same result regardless of the
underlying hardware implementation.

Another part of the problem is that the folks who write specifications for
computer languages (Ada excluded) don't specify the type of arithmetic to
be performed. Basically they wimp out and say "the answer you get depends
on the hardware".
This is especially true where floating point is concerned.

(The preceding is my opinion -- I do not represent my employer or IEEE in
presenting these views.)