Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 (Tek) 9/26/83; site orca.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!decvax!tektronix!orca!graham
From: graham@orca.UUCP (Graham Bromley)
Newsgroups: net.lang.c
Subject: ANSI C Standard: Float/Double Handling
Message-ID: <1152@orca.UUCP>
Date: Fri, 9-Nov-84 14:34:35 EST
Article-I.D.: orca.1152
Posted: Fri Nov  9 14:34:35 1984
Date-Received: Sat, 10-Nov-84 10:24:39 EST
Organization: Tektronix, Wilsonville OR
Lines: 55


	Having read some comments on float/double constant and
variable handling, I'd like to say (quite) a few words on the
subject.

	Recently I modified float/double handling in a 4.1 C
compiler. The goal was to separate float and double into two
completely distinct data types, for BOTH variables and constants.
Float -> double promotion in function arguments was also addressed.
The following scheme was used.

	* A numeric constant containing an exponent identifier 'e' 
		(e.g. 1.3e03) is of type float, regardless of the number 
		of significant digits. Excess significant digits are discarded.
	* A numeric constant containing an exponent identifier 'd' 
		(e.g. 1.3d03) is of type double, regardless of the number 
		of significant digits.
	* A constant containing a '.' but no 'e' or 'd' is of type
		float or double, as dictated by the number of significant
		digits and the host machine float representation.
	* In expressions of the type
				float OP float
		there is no float->double conversion. But there is in
		expressions of the type
				float OP double
		of course.
	* When a function is called, a float argument is not converted
		to double.
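
	For anyone who wants to check the expression rules on a
compiler they have at hand, here is a minimal sketch. It is not
the modified 4.1 compiler: it assumes an ISO C compiler, where a
float constant is written with an 'f' suffix rather than a
distinct exponent letter, and the function names are made up for
illustration. sizeof is used because it reports the type of an
expression without evaluating it.

```c
#include <stddef.h>

/* float OP float: in ISO C the result type is float, not double. */
size_t size_float_op_float(void)
{
    float a = 1.5f, b = 2.5f;
    return sizeof(a * b);
}

/* float OP double: the float operand is converted, result is double. */
size_t size_float_op_double(void)
{
    float a = 1.5f;
    double d = 2.5;
    return sizeof(a * d);
}
```

	The first function should return sizeof(float) and the
second sizeof(double). Whether the float multiply is actually
carried out in single precision is still up to the implementation.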

	The standard math library was also modified to include both
float and double versions of each function, e.g. float sin() and
double dsin(), float exp() and double dexp() etc. It took about
a week to do everything.

	Simple benchmarks showed a speedup of about 2.0:1 for
processes bound by floating point arithmetic which used only
float operations (no double). Because a float uses only one
register on the VAX and a double needs two, these changes also
allowed more float arithmetic to be done in registers - in
particular, almost all of the arithmetic in the float math library
functions.

	It seems to be a poor idea to make all float constants
double, because a single constant in an expression will promote
all the way through the expression, causing all those unwanted
conversions.
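
	The effect of one double constant can be seen in a short
sketch. It assumes a C11 compiler (for _Generic) in which an
unsuffixed constant such as 1.5 is double and 1.5f is float; the
enum, the macro and the function names are hypothetical, chosen
only for this illustration.

```c
/* Classify the type of an expression at compile time (C11 _Generic). */
enum kind { KIND_FLOAT, KIND_DOUBLE, KIND_OTHER };

#define KIND_OF(expr) \
    _Generic((expr), float: KIND_FLOAT, double: KIND_DOUBLE, default: KIND_OTHER)

/* One double constant makes the sum, and anything built on it, double. */
enum kind kind_with_double_constant(void)
{
    float x = 2.0f, y = 3.0f;
    return KIND_OF(x * y + 1.5);
}

/* With a float constant the whole expression stays float. */
enum kind kind_with_float_constant(void)
{
    float x = 2.0f, y = 3.0f;
    return KIND_OF(x * y + 1.5f);
}
```

	Note that in the first function x * y itself is still a
float product; it is the addition, and every operator applied to
its result, that is carried out in double.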

	Anyone considering this type of thing should beware of side
effects, e.g. printf will treat a float argument as double (it
will remove 8 bytes from the stack), as will anything else which
hasn't been compiled with the modified compiler. One way to solve
this, if you don't want to recompile all standard code with your
modified compiler, is to cast all float arguments to standard
functions with (double). The cast is harmless on an unmodified
compiler, since that already promotes float arguments to double.
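
	As a sketch of the cast, assuming a library that provides
snprintf (it is a C99 addition, not part of the 4.1 library) and
a made-up helper name format_float:

```c
#include <stdio.h>

/*
 * Format a float with two decimal places. The (double) cast makes the
 * argument match what the printf family expects for %f, whether or not
 * the compiler promotes float arguments on its own.
 */
int format_float(char *buf, size_t n, float x)
{
    return snprintf(buf, n, "%.2f", (double)x);
}
```

	Called with a 32 byte buffer and 1.5f, this should leave
"1.50" in the buffer and return 4.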