Xref: utzoo comp.arch:14424 comp.lang.lisp:2868 comp.lang.misc:4341 comp.lang.smalltalk:1736
Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!decwrl!shelby!neon!carcoar!wilson
From: wilson@carcoar.Stanford.EDU (Paul Wilson)
Newsgroups: comp.arch,comp.lang.lisp,comp.lang.misc,comp.lang.smalltalk
Subject: CORRECTION to: heaps of numbers (tagged immediate floats)
Summary: steal 2 or 3, not 4 bits, because exponentiation is double
Keywords: floating point, garbage collection, tags, IEEE floats, whoops
Message-ID: <1990Mar6.015230.20068@Neon.Stanford.EDU>
Date: 6 Mar 90 01:52:30 GMT
References: <1990Mar5.220724.3718@Neon.Stanford.EDU> <14261@lambda.UUCP>
Sender: root@Neon.Stanford.EDU (System PRIVILEGED Account)
Reply-To: wilson@carcoar.Stanford.EDU (Paul Wilson)
Organization: U. of Illinois at Chicago (UIC, *not* UofC or UIUC)
Lines: 38

Thanks to Carl Lowenstein, Jim Giles and Herman Rubin for pointing out
my misunderstanding of the IEEE float format.  I had not realized that
the range is doubly exponential in the width of the exponent field (n
exponent bits cover on the order of 2**(2**n)) -- that changes things a
bit.  (Or maybe two bits.)

(I seem to recall that the only floating point format I ever had to learn
used a power of a fixed number, and I thought IEEE would be the same.)

So it looks like each marginal bit would be more important than I thought,
favoring stealing _very_few_ bits from the exponent.  2**(2**4) would
only give you a range of about 1/32K to 32K.  But 2**(2**5) would give you
a range from about half a nano (1/2G) to 2 giga, which seems pretty
reasonable.  And 2**(2**6) goes from pretty seriously small to pretty
seriously big (by my standards).
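
A rough sketch of the arithmetic, for anyone who wants to check the
ranges.  It assumes the narrowed exponent field keeps the IEEE
conventions (bias of 2**(bits-1) - 1, all-zeros and all-ones codes
reserved); exponent_range is a made-up helper name, not part of any
library.

```python
def exponent_range(field_bits):
    """Smallest and largest normal binary exponents for an exponent field.

    Assumption: IEEE-style encoding, i.e. bias = 2**(field_bits - 1) - 1,
    with the all-zeros code (zero/denormals) and the all-ones code
    (infinity/NaN) reserved.
    """
    bias = 2 ** (field_bits - 1) - 1
    e_min = 1 - bias                       # smallest normal exponent
    e_max = (2 ** field_bits - 2) - bias   # largest normal exponent
    return e_min, e_max


for bits in (4, 5, 6, 8):
    e_min, e_max = exponent_range(bits)
    # The smallest normal value is 2**e_min; the largest finite value is
    # just under 2**(e_max + 1) because the significand is below 2.
    print(f"{bits}-bit exponent: 2**{e_min} .. just under 2**{e_max + 1}"
          f"  (~{2.0 ** e_min:.3g} .. ~{2.0 ** (e_max + 1):.3g})")
```

Under that assumption a 5-bit field spans about 2**-14 to 2**16 and a
6-bit field about 2**-30 to 2**32, which are the same order as the 32K
and 2 giga figures above.  So the n in 2**(2**n) may be one less than
the field width, since the bias spends half the codes on negative
exponents.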

I take this to mean that I can steal 2 bits from the 8-bit exponent,
leaving 6, or maybe 3 bits, leaving 5.

It would be awkward to use fewer than two bits for primary tags, given that
you probably want separate tags for pointers, immediate ints, and other
immediates.  So it looks like the question comes down to this:  is the
half-nano to 2-giga range enough for the large majority of floats that
get stored into memory, or should I go with the less convenient scheme
of only stealing 2 bits?  In the latter case I'd have to use up one of our
four primary (2-bit, low-order) tags, but we could live with it.
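
One way the packing could look, as a minimal sketch rather than a
worked-out design.  An IEEE single stores its exponent with a bias of
127, so 1.0 has the exponent field 0111 1111 and the upper bits are not
literally zero for mid-range values; the sketch therefore re-biases the
exponent into the narrower field before shifting.  The layout (tag in
the low bits, sign/exponent/fraction above it), the tag value 0b01, and
the names encode_immediate and decode_immediate are all assumptions
made for illustration.

```python
import struct


def float_to_bits(x):
    """Bit pattern of x rounded to an IEEE single, as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Inverse of float_to_bits."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def encode_immediate(x, stolen=2, tag=0b01):
    """Pack x into a 32-bit tagged word, or return None if it does not
    fit (the caller would then fall back to a heap-allocated float)."""
    assert 0 <= tag < 2 ** stolen
    bits = float_to_bits(x)
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    new_bits = 8 - stolen
    new_bias = 2 ** (new_bits - 1) - 1
    if exp == 0 and frac == 0:
        new_exp = 0                # keep +0.0 and -0.0 representable
    elif exp in (0, 0xFF):
        return None                # denormals, infinities, NaNs
    else:
        new_exp = exp - 127 + new_bias
        if not 1 <= new_exp <= 2 ** new_bits - 2:
            return None            # exponent too big or too small
    payload = (sign << (new_bits + 23)) | (new_exp << 23) | frac
    return (payload << stolen) | tag


def decode_immediate(word, stolen=2):
    """Recover the float from a word produced by encode_immediate."""
    payload = word >> stolen
    new_bits = 8 - stolen
    new_bias = 2 ** (new_bits - 1) - 1
    sign = payload >> (new_bits + 23)
    new_exp = (payload >> 23) & (2 ** new_bits - 1)
    frac = payload & 0x7FFFFF
    exp = 0 if new_exp == 0 else new_exp - new_bias + 127
    return bits_to_float((sign << 31) | (exp << 23) | frac)
```

For example, decode_immediate(encode_immediate(1.5)) gives back 1.5,
while encode_immediate(1e30) returns None with either 2 or 3 stolen
bits, so that value would have to live on the heap.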

Any empirical info relevant to this tradeoff would be greatly appreciated.

And if I've got it wrong somehow, please point it out.  (As I see it, my
real assumption is this:  the upper 2 or 3 non-sign bits of an IEEE short
(single-precision) float are usually zero.  If I've misunderstood FP
representation or distributions, and this is not true, let me know.)
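
For anyone with a trace of stored floats handy, a check along these
lines would answer the question.  It is only a sketch:  fits and
fraction_fitting are hypothetical names, the sample list is made up
rather than measured, and the test assumes the narrowed exponent is
re-biased IEEE-style rather than literally having zero upper bits.

```python
import math
import struct


def fits(x, stolen):
    """True if x, rounded to an IEEE single, is zero or a normal number
    whose exponent fits in an exponent field narrowed by `stolen` bits."""
    try:
        x = struct.unpack("<f", struct.pack("<f", x))[0]  # round to single
    except OverflowError:
        return False                  # too large even for a full single
    if x == 0.0:
        return True
    if math.isinf(x) or math.isnan(x):
        return False
    bias = 2 ** (8 - stolen - 1) - 1
    unbiased = math.frexp(x)[1] - 1   # frexp's mantissa lies in [0.5, 1)
    return 1 - bias <= unbiased <= bias


def fraction_fitting(samples, stolen):
    """Fraction of the samples that could be stored as immediates."""
    samples = list(samples)
    return sum(fits(x, stolen) for x in samples) / len(samples)


# Made-up values purely to exercise the functions; real data would come
# from instrumenting the stores in an actual program.
sample = [0.0, 1.0, -2.5, 3.14159, 1e-3, 6.02e23, 1e-12, 255.0]
for stolen in (2, 3):
    print(stolen, "bits stolen:", fraction_fitting(sample, stolen))
```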

   -- Paul

Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348   Chicago,IL 60680