Xref: utzoo comp.lang.c:11754 comp.std.c:254 sci.math:4362
Path: utzoo!attcan!uunet!husc6!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.lang.c,comp.std.c,sci.math
Subject: Re: Floating point puzzle
Keywords: floating point representation
Message-ID: <259@quintus.UUCP>
Date: 8 Aug 88 01:16:14 GMT
References: <3117@emory.uucp>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 28

In article <3117@emory.uucp> riddle@emory.uucp (Larry Riddle) writes:
>Notice that x and y, which have been declared as floats, and thus have
>a 32 bit representation (according to the manual this obeys IEEE
>floating point arithmetic standards), both are printed the same in hex,

This is >>C<< remember?  Floats are 32-bits IN MEMORY, but when you
operate on them or pass them to functions they are automatically
converted to double.

Since you specifically mention the Sun-4, I suggest that you read your
copy of the Sun Floating-Point Programmer's Guide.  In particular, if
you really want to pass 32-bit floating-point numbers in float format,
you will need the "-fsingle2" compiler option.  (I haven't tried this
on a Sun-4, but it works fine on Sun-3s.)

The two flags are
    -fsingle    If the operands of a floating-point operation are
                both 'float', the operation will be done in single
                precision instead of the normal double precision.

    -fsingle2   Float arguments are passed to functions as 32 bits,
                and float results are returned as 32 bits.
                (Useful for calling Fortran functions.)

TRAP: floating-point constants such as 1.0 are DOUBLE precision,
so if the compiler sees
        float x;
        ... x+1.0
it will do the addition in double precision.  In such a case, I do
        float one = 1.0;
        ... x+one...
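
The TRAP above can be checked with a small sketch. It assumes an ANSI/ISO C compiler (where float+float has type float without needing anything like -fsingle) and IEEE single and double formats; it is not the Sun compiler behaviour described in the post. All the function names are made up for the illustration, and 1.0e-8 is used in place of 1.0 only because it is too small to survive a single-precision addition to 1.0, which makes the difference visible.

```c
#include <assert.h>
#include <stddef.h>

/* x + 1.0e-8: the constant is a double, so x is converted to double
   and the addition is carried out in double precision. */
double add_double_constant(float x)
{
    return x + 1.0e-8;
}

/* Same sum, but the small value lives in a float variable (the
   "float one = 1.0;" trick), so both operands are float. */
float add_float_variable(float x)
{
    float tiny = 1.0e-8f;
    volatile float sum = x + tiny;  /* volatile store forces rounding to float */
    return sum;
}

/* The type of each expression, observed through sizeof
   (sizeof does not evaluate its operand). */
size_t size_with_double_constant(void)
{
    float x = 0.0f;
    return sizeof(x + 1.0);         /* float + double has type double */
}

size_t size_with_float_variable(void)
{
    float x = 0.0f;
    float one = 1.0;
    return sizeof(x + one);         /* float + float has type float */
}

/* Hypothetical self-check collecting the four observations. */
void check_float_double_trap(void)
{
    assert(size_with_double_constant() == sizeof(double));
    assert(size_with_float_variable() == sizeof(float));
    assert(add_double_constant(1.0f) > 1.0);   /* 1e-8 survives in double */
    assert(add_float_variable(1.0f) == 1.0f);  /* 1e-8 is lost in float */
}
```

Calling check_float_double_trap() from main should pass silently under those assumptions. Writing the constant as 1.0f has the same effect as the float variable in ANSI C; whether a pre-ANSI compiler accepts that suffix is a separate question.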