Path: utzoo!attcan!uunet!munnari.oz.au!goanna!ok
From: ok@goanna.oz.au (Richard O'keefe)
Newsgroups: comp.arch
Subject: Re: floating point formats -- usage of small floats
Keywords: floating point formats, floating point usage, Lisp, heaps
Message-ID: <2944@goanna.oz.au>
Date: 5 Mar 90 21:11:06 GMT
References: <3880@uceng.UC.EDU> <1990Mar5.031003.12107@Neon.Stanford.EDU>
Organization: Comp Sci, RMIT, Melbourne, Australia
Lines: 62

In article <1990Mar5.031003.12107@Neon.Stanford.EDU>,
wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
> [ wants to implement short-floats; thinks it would be cool to take bits
>   out of the exponent; wants to know how many ]

> I figure that I can just lop a couple of bits off of the exponent part, and
> leave the fraction part alone.  That way, I sacrifice range of floats
> but not precision.

The snag with that is that you sacrifice *all* the precision at the ends
of your ranges.  The IEEE-754 formats are very carefully balanced; there
are actually rules that tell you what would be good sizes for the fields.
Obviously, a short-float cannot be IEEE-754, but you *could* try to see
how many of the constraints listed in IEEE-854 you can satisfy.

To give you some idea of the kind of thinking that is involved:
suppose you have N effective mantissa bits (counting the hidden bit).
You would like
	nextafter(1.0, /*in the direction of*/ 2.0) - 1.0
(that is, the next number just bigger than 1.0, minus 1.0)
to be representable; that's your epsilon.  So you want 2**(1-N) to be
representable, and it's a general rule that the exponent range should be
roughly symmetric, so you want the range to be at least 1-N..N-1,
which means you need approx ceil(log2(N)+1) exponent bits.  For N=24,
that means an absolute minimum of 6 exponent bits.  More than that,
you'd really like an epsilon's worth of epsilon, so you want at least
ceil(log2(N)+2) exponent bits, and now we're just 1 short of 754's
8 exponent bits.

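As a rough check of that arithmetic, here is a small Python sketch.  It
assumes the rule of thumb exactly as stated above (ceil(log2(N)+1) bits
so that epsilon is representable, ceil(log2(N)+2) bits for an epsilon's
worth of epsilon).  The function names are made up for illustration.

```python
import math
import sys

def min_exponent_bits(n, slack=1):
    """Rule of thumb from the text: ceil(log2(N) + slack) exponent bits.

    n is the number of effective significand bits (hidden bit included).
    slack=1 makes epsilon representable; slack=2 leaves room for
    "an epsilon's worth of epsilon".
    """
    return math.ceil(math.log2(n) + slack)

def epsilon(n):
    """Gap between 1.0 and the next larger number with n significand bits."""
    return 2.0 ** (1 - n)

if __name__ == "__main__":
    for n in (24, 53):
        print(n, min_exponent_bits(n, 1), min_exponent_bits(n, 2), epsilon(n))
```

For N=24 this gives 6 and 7 exponent bits, the same figures as in the
text, and epsilon(53) agrees with the machine epsilon Python reports for
its own doubles (sys.float_info.epsilon).
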
It is much better to steal bits out of the significand field.  There are
several Prolog systems that do it; however that approach is losing its
popularity.  It was ok when you were just entering co-ordinates for
graphics, but for serious calculation anything less than the usual 32
bits is unsafe in unskilled hands.

If you do decide to steal bits out of the significand field, make a
conscious design decision about whether you want to preserve the
"graceful underflow" property of the IEEE standards or not.  If you do,
you'd have quite a reasonable system, but the conversions are not as
simple.  If you don't, the conversions are simpler, but the resulting
arithmetic system isn't much like IEEE.

That's what most people have talked about.
But it's *not* what Paul Wilson *asked* about.
His proposal is:

> If a float result goes out of that range, I'll make
> a full-sized floating point object on the heap, and use a tagged
> pointer to it.

This is a really neat idea.  I like it a LOT.  He's not actually
proposing to change the properties of the arithmetic system at all,
merely the representation in memory.  (Rather like the VAX tiny-float
immediate operands.)  Presumably DEC picked the range of their
tiny-floats based on some expectation of the range of constants in
programs; if your coding covers that, you're doing at least as well as
the VAX.

A suggestion:  why not scan through CALGO (ACM Collected Algorithms) and
issues of JRSS Series C and such things to see what some typical ranges
are like.  I've had ranges of 10**9 or more in my code, but so what?
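
To make the tagged-immediate idea concrete, here is a minimal sketch in
Python.  Every specific in it is an assumption for illustration, not
Paul Wilson's design: a 32-bit word, a 2-bit tag, IEEE single as the
full-size format, and a 6-bit exponent window around 1.0.  The names
(TAG_SHORT_FLOAT, encode, decode) are hypothetical, the "heap object" is
just a Python tuple, and the input is assumed to be exactly
representable as an IEEE single already.

```python
import struct

TAG_BITS = 2                 # hypothetical tag width
TAG_SHORT_FLOAT = 0b01       # hypothetical tag value for immediate floats
EXP_BITS = 6                 # exponent bits kept in the immediate form
EXP_LOW = 127 - 32           # lowest biased single exponent kept immediate
EXP_HIGH = EXP_LOW + (1 << EXP_BITS)   # exclusive upper bound

def encode(x):
    """Return ('imm', 32-bit word) if x fits the window, else ('boxed', x)."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if EXP_LOW <= exp < EXP_HIGH:
        # The full 23-bit fraction is kept, so no precision is lost.
        payload = (sign << 29) | ((exp - EXP_LOW) << 23) | frac
        return ('imm', (payload << TAG_BITS) | TAG_SHORT_FLOAT)
    # Out of window: zero, denormals, infinities, NaNs, very large or very
    # small values.  Stand-in for a full-size float object on the heap.
    return ('boxed', x)

def decode(tagged):
    """Recover the float from either representation."""
    kind, value = tagged
    if kind == 'boxed':
        return value
    payload = value >> TAG_BITS
    sign = payload >> 29
    exp = ((payload >> 23) & ((1 << EXP_BITS) - 1)) + EXP_LOW
    frac = payload & 0x7FFFFF
    bits = (sign << 31) | (exp << 23) | frac
    return struct.unpack('<f', struct.pack('<I', bits))[0]

if __name__ == "__main__":
    for x in (1.0, -3.25, 2.0 ** -31, 2.0 ** 40, 0.0):
        kind, _ = encode(x)
        print(x, kind, decode(encode(x)))
```

Because out-of-window values fall back to the full-size object, the
arithmetic seen by the program is unchanged; only the storage differs.
Note that in this sketch 0.0 ends up boxed, since its exponent field is
0.  A real design would presumably want to reserve an immediate code for
it.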