Path: utzoo!mnetor!geac!torsqnt!tmsoft!robohack!eci386!clewis
From: clewis@eci386.uucp (Chris Lewis)
Newsgroups: can.usrgroup
Subject: Re: C code (fwd)
Message-ID: <1990Mar1.165859.29471@eci386.uucp>
Date: 1 Mar 90 16:58:59 GMT
References: <9002280838.AA01415@cohort.uucp>
Reply-To: clewis@eci386.UUCP (Chris Lewis)
Distribution: ont
Organization: Elegant Communications Inc., Toronto, Canada
Lines: 51

In article <9002280838.AA01415@cohort.uucp> Steve Bird writes:
| float a,b;
| b = 2.0e20 + 1.0;
| a = b - 2.0e20;
| printf("%f \n",a);
|
| When compiled the program returns the number 4008175468544.000000 .
| Now when the program is modified to read :
|
| float a,b;
| b = 2.0e20 + 1.0;
| a = 2.0e20;
| printf("%f \n",b - a);
|
| The program returns 0.000000 . Why ?

Actually, it's truncation error ;-)

The numbers printed in the above cases frequently depend upon the
precise C compiler and processor you're running on. The explanation
I give is for "traditional" C, according to K&R (ANSI C provides for
different behaviour under certain circumstances):

When a float is passed to a function or used in an expression, the
operand is first coerced to a double. E.g., the subtraction in the
first fragment has both operands coerced to double, and then the
result is forced into a float. Since floats are usually half the
size of doubles, you lose digits off the least significant end. The
same truncation already happened when 2.0e20 + 1.0 was stored in b,
while the constant 2.0e20 in the subtraction is still a full double,
so the difference that gets printed is the error introduced by
squeezing b into a float.

In the second fragment, the subtraction is also done with both
operands as doubles, but since the result is being used as a function
argument, it is not truncated into a float; it's passed as a double
to printf's %f handler.

There are other factors coming into play: on some machines 2e20 + 1
may actually *equal* 2e20, depending upon how many digits of precision
the variable that the result is stored in has.
(Which is what I suspect your second example is trying to tell you.)

Frankly, given that "traditional" C does all floating point operations
and argument passing in doubles, I almost never use a float to store
the result of an FP operation, and only use floats in large arrays.
If you do use floats for the results of FP operations, you should
understand the algorithm well enough to know the magnitudes of the
operands involved. This sort of thing is still possible with doubles,
but you can get away with more.

If you make everything double, chances are it'll be faster (less
coercing required), and it will only use significant amounts of space
in large arrays.
-- 
Chris Lewis, Elegant Communications Inc, {uunet!attcan,utzoo}!lsuc!eci386!clewis
Ferret mailing list: eci386!ferret-list, psroff mailing list: eci386!psroff-list