Path: utzoo!utgpu!watserv1!watmath!att!tut.cis.ohio-state.edu!cs.utexas.edu!uunet!convex!garzione
From: garzione@convex.com (Michael Garzione)
Newsgroups: comp.lang.c
Subject: Re: Inherent imprecision of floating point variables
Message-ID: <103379@convex.convex.com>
Date: 26 Jun 90 12:53:38 GMT
References: <3300@crash.cts.com>
Sender: usenet@convex.com
Lines: 50

rond@pro-grouch.cts.com (Ron Dippold) writes:

>In-Reply-To: message from pariyana@hawk.ulowell.edu

>> I have a question regarding floating point variables.  I have a
>> program that requires exact precision in all computations.
>> Floating point variables have given me trouble.  For example, when
>> I enter 32.14 into a floating point variable, then print it out, it
>> prints 32.139998.  This problem is magnified in all computations.
> 
>Check out the types that your compiler supports.  Many support a "double
>float" or something similar with much more precision.  Some also support IEEE
>floating-point numbers, which use 80 bits.


>UUCP: crash!pro-grouch!rond
>ARPA: crash!pro-grouch!rond@nosc.mil
>INET: rond@pro-grouch.cts.com

You will never get "exact" results from a floating point computation
if any of the numbers used cannot be represented exactly in base 2 (assuming
the exponent in the floating point number is represented as a power of 2 and
the mantissa is also in base 2 and of finite length).  A wider type such as
double makes the error smaller, but it does not make it go away.

The error you see above has three sources.  
(1) 32.14 (base 10) is not a finite sum of powers of 2 (its binary
    fraction repeats forever), so it does not have an exact
    representation as a binary floating point number
(2) If you say something like "float i = 32.14" (or read the number with
    scanf) you get another potential error (in this case it's just a
    result of #1) as the compiler or the input routines convert a base 10
    _printed_ representation of a number to its binary equivalent
    (base 2 mantissa and exponent)
(3) Now that you've got an approximation of 32.14, printing it out is
    still another potential source of error, since the binary
    representation has to be converted back to base 10 and rounded to
    the number of digits printed.
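
A small sketch of the points above, in case it helps.  The function names
are made up for illustration, and I'm assuming float and double are IEEE
single and double precision and that snprintf is available and rounds
correctly.

```c
#include <stdio.h>

/* Returns 1 if the double d survives a round trip through float
   unchanged, i.e. d is exactly representable as a float. */
int fits_in_float(double d)
{
    return (double)(float)d == d;
}

/* Formats x the way printf("%f", x) would and returns the text.
   The result lives in a static buffer, so copy it if you need
   to keep it across calls. */
const char *show_float(float x)
{
    static char buf[64];

    snprintf(buf, sizeof buf, "%f", (double)x);
    return buf;
}
```

Under those assumptions fits_in_float(32.14) is 0, while
fits_in_float(32.125) is 1, because 32.125 is 32 + 1/8.  show_float(32.14f)
should give 32.139999 when the library converts accurately.  A less careful
conversion routine can print a slightly different last digit, which is
source (3) at work.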

(From Knuth, Vol II, p. 200 : "Note:  Since floating point arithmetic is
inherently approximate...")

If you _really_ need *exact* precision you may want to consider using
a representation where non-integral numbers are represented as the ratio
of two integral numbers and use multiple precision arithmetic on them.
You'll be guaranteed exact results for addition, subtraction,
multiplication and division (though not for things like square roots),
but it's probably not worth the hassle.
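
To give an idea of what the ratio representation looks like, here is a
minimal sketch.  The type and function names are my own invention, and it
uses plain longs rather than real multiple precision arithmetic, so it will
overflow once numerators or denominators get large.  It also assumes the
denominator is never zero.

```c
#include <stdlib.h>

/* A rational number kept in lowest terms with den > 0. */
typedef struct {
    long num;
    long den;
} ratio;

/* Greatest common divisor of |a| and |b| (Euclid's algorithm). */
static long gcd_long(long a, long b)
{
    a = labs(a);
    b = labs(b);
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Builds num/den reduced to lowest terms.  Assumes den != 0. */
ratio ratio_make(long num, long den)
{
    ratio r;
    long g;

    if (den < 0) {          /* keep the sign in the numerator */
        num = -num;
        den = -den;
    }
    g = gcd_long(num, den);
    if (g == 0)
        g = 1;
    r.num = num / g;
    r.den = den / g;
    return r;
}

/* a + b, exact as long as the intermediate products fit in a long. */
ratio ratio_add(ratio a, ratio b)
{
    return ratio_make(a.num * b.den + b.num * a.den, a.den * b.den);
}

/* a * b, with the same overflow caveat. */
ratio ratio_mul(ratio a, ratio b)
{
    return ratio_make(a.num * b.num, a.den * b.den);
}
```

With this, 32.14 is stored as ratio_make(3214, 100), which reduces to
1607/50 and is exact.  Adding 1/10 and 2/10 gives exactly 3/10, with no
rounding anywhere.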

Hope this helps.

Mike
Convex Computer Corporation                            
{uunet,sun}!convex!garzione
garzione@convex.com