Xref: utzoo comp.lang.c:11754 comp.std.c:254 sci.math:4362
Path: utzoo!attcan!uunet!husc6!mailrus!cornell!uw-beaver!teknowledge-vaxc!sri-unix!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.lang.c,comp.std.c,sci.math
Subject: Re: Floating point puzzle
Keywords: floating point representation
Message-ID: <259@quintus.UUCP>
Date: 8 Aug 88 01:16:14 GMT
References: <3117@emory.uucp>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 28

In article <3117@emory.uucp> riddle@emory.uucp (Larry Riddle) writes:
>Notice that x and y, which have been declared as floats, and thus have
>a 32 bit representation (according to the manual this obeys IEEE
>floating point arithmetic standards), both are printed the same in hex,

This is >>C<<, remember?  Floats are 32 bits IN MEMORY, but when you
operate on them or pass them to functions they are automatically
converted to double.
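
Here is a minimal sketch of the argument-passing half of that, assuming an
ANSI C compiler with <stdarg.h>.  The helper names are mine, not anything
from the Sun manuals.  A float handed to a variadic function (printf is the
usual one) arrives as a double, so the callee has to fetch it as a double.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: returns its first variadic argument, read as a
   double.  A float passed in that position has already been widened to
   double by the default argument promotions. */
double first_vararg_as_double(int count, ...)
{
    va_list ap;
    double value;

    va_start(ap, count);
    value = va_arg(ap, double);  /* fetching it as float would be undefined */
    va_end(ap);
    return value;
}

/* Hypothetical helper: size of a float as it sits in memory. */
size_t float_storage_size(void)
{
    float x = 0.1f;
    return sizeof x;
}
```

Calling first_vararg_as_double(1, x) with a float x gives back (double)x,
which for x = 0.1f is not the same number as the double constant 0.1.
float_storage_size() reports 4 on machines where float is the IEEE 32-bit
format.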

Since you specifically mention the Sun-4, I suggest that you read your
copy of the Sun Floating-Point Programmer's Guide.  In particular, if
you really want to pass 32-bit floating-point numbers in float format,
you will need the "-fsingle2" compiler option.  (I haven't tried this
on a Sun-4, but it works fine on Sun-3s.)

The two flags are

	-fsingle	If the operands of a floating-point operation are
			both 'float', the operation will be done in single
			precision instead of the normal double precision.

	-fsingle2	Float arguments are passed to functions as 32 bits,
			and float results are returned as 32 bits.  (Useful
			for calling Fortran functions.)
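
To see why the choice between single and double arithmetic matters, here is
a rough sketch that imitates both behaviours with explicit casts.  It is not
output from the Sun compiler, and the function names are made up for the
illustration.  It assumes IEEE arithmetic with the default round-to-nearest
mode.

```c
/* Add two floats and round the sum to float, as an operation carried out
   in single precision would. */
float add_in_single(float a, float b)
{
    volatile float sum = a + b;  /* the store forces rounding to float */
    return sum;
}

/* Widen both operands to double first, then add in double precision. */
double add_in_double(float a, float b)
{
    return (double)a + (double)b;
}
```

With a = 16777216.0f (2^24) and b = 1.0f, add_in_single gives back
16777216.0f, because 16777217 is not representable as a float, while
add_in_double gives 16777217.0.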

TRAP:  floating-point constants such as 1.0 are DOUBLE precision, so if the
compiler sees  float x; ... x+1.0  it will do the addition in double
precision, even with -fsingle.  In such a case, I write
	float one = 1.0; ... x+one ...
so that both operands are float.
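
A small sketch of the trap, assuming an ANSI C compiler, where float + float
has type float and the f suffix makes a float constant.  (Under the older
rules everything here would be double.)  The function names are
hypothetical.  sizeof reports the type of each sum without evaluating it.

```c
#include <stddef.h>

/* x + 1.0: the constant is double, so the sum has type double. */
size_t size_of_sum_with_double_constant(float x)
{
    return sizeof(x + 1.0);
}

/* x + one: both operands are float, so the sum has type float. */
size_t size_of_sum_with_float_variable(float x)
{
    float one = 1.0;
    return sizeof(x + one);
}

/* x + 1.0f: the f suffix gives a float constant directly. */
size_t size_of_sum_with_float_constant(float x)
{
    return sizeof(x + 1.0f);
}
```

The first function returns sizeof(double); the other two return
sizeof(float).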