Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!att!dptg!ulysses!andante!alice!ark
From: ark@alice.UUCP (Andrew Koenig)
Newsgroups: comp.lang.c
Subject: Re: Inherent imprecision of floating point variables
Message-ID: <11050@alice.UUCP>
Date: 15 Jul 90 16:34:49 GMT
References: <11035@alice.UUCP> <7913@ncar.ucar.edu>
Organization: AT&T Bell Laboratories, Liberty Corner NJ
Lines: 35

In article <7913@ncar.ucar.edu>, steve@groucho.ucar.edu (Steve Emmerson) writes:
> In <11035@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:

> >Granted by the IEEE floating-point standard, for one thing.
> >If I am using a system whose vendor claims that it supports
> >IEEE floating point, then I can expect that

> >	input conversion on a floating point number with an
> >	exact representation will be exact;

> An earlier discussion of the IEEE standard indicated that it allows an
> exactly-representable value to be off by one bit.

> Is this, then, incorrect?

It depends on what you mean by `off by one bit.'

Every decimal floating-point literal (such as 0.3 or 2.7e-28) is the exact
representation of some rational number.  When trying to fit that number
into an IEEE floating-point representation, some accuracy may have to be
lost by rounding.  The IEEE standard allows that rounding error, of course,
plus up to 0.47 times the value of the low-order bit in additional error.
Among other things, the number 0.47 implies that

	some numbers will have the last bit wrong when converted, but

	if the input exactly represents an IEEE floating-point value,
	then that is the value you will get.  Moreover,

	if you write an IEEE value out with enough significant digits
	and read it back in again, the accumulated error will never be
	enough to change the value.
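
As a rough illustration of the last two points, here is a minimal C
sketch.  It assumes that double is an IEEE double-precision value, that
17 significant digits count as `enough' for that format, and that the
library's printf and strtod families stay within the IEEE conversion
bounds.  The function names are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption: 17 significant decimal digits suffice to identify any
   IEEE double uniquely. */
#define ENOUGH_DIGITS 17

/* Write x out as decimal text, read the text back in, and report
   whether the value survived the trip unchanged (1 = yes, 0 = no). */
int round_trips(double x)
{
    char buf[64];   /* ample room for a %.17g rendering of a double */

    snprintf(buf, sizeof buf, "%.*g", ENOUGH_DIGITS, x);
    return strtod(buf, NULL) == x;
}

/* Report whether decimal text that exactly names a binary value
   (0.375 is 3/8, for instance) converts to exactly that value. */
int reads_exactly(const char *text, double expected)
{
    return strtod(text, NULL) == expected;
}
```

On a conforming system one would expect round_trips(0.1) and
reads_exactly("0.375", 3.0 / 8.0) both to return 1, even though 0.1
itself has no exact binary representation.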
-- 
				--Andrew Koenig
				  ark@europa.att.com