Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!think.com!linus!linus!linus!dce
From: dce@mitre.org (Dale Earl)
Newsgroups: comp.arch
Subject: RE: IEEE arithmetic
Message-ID: 
Date: 19 Jun 91 20:01:39 GMT
Sender: news@linus.mitre.org (News Service)
Distribution: usa
Organization: Mitre, Bedford, MA
Lines: 53
Nntp-Posting-Host: dvinci.mitre.org

>>Henry Spencer wrote (a while back now):
>>My understanding is that the funnier-looking features, in particular the
>>infinities, NaNs, and signed zeros, mostly cost essentially nothing.
>
> I believe this statement is incorrect.  Infs and NaNs complicate
>the design of the floating point unit.  Even if there is ultimately
>no performance impact, this added design effort represents a very real
>cost.  Perhaps some hardware designers could comment.

     With my previous employer I helped design several floating point
units, and in my experience the most difficult part of the IEEE
specification to handle was the denormalized numbers.  The infinities,
NaNs, signed zeros, and rounding modes do not make the hardware
substantially more complicated or significantly decrease performance.
As has been pointed out in other related postings, these features can
be valuable for many numerical algorithms.  The value vs. cost for
denormalized numbers, though, is not favorable IMHO.  I feel a
representation for an infinitely small non-zero value would be useful
and much easier to implement in hardware.  But if you have to be able
to say that your device is 100% IEEE floating point compatible, you
get stuck implementing the entire standard no matter what the hardware
tradeoffs.

     A related method we also employed on another device that did not
have to be 100% compliant may be of interest.  This device performed
both single precision integer and floating point operations, so we had
a full 32 bit adder and multiplier available.
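     For readers wondering why the denormals are the expensive case
above: they fill the gap between zero and the smallest normalized
value, but their encoding (zero exponent field, no hidden bit) forces
extra shift/normalization paths in the datapath.  A quick software
illustration of the format itself, using Python's view of IEEE 754
doubles (just the standard encoding, nothing specific to the hardware
discussed here):

```python
import struct

def bits(x):
    """Return the raw 64-bit IEEE 754 pattern of a Python float."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

smallest_normal = 2.0 ** -1022   # smallest normalized double
subnormal       = 2.0 ** -1074   # smallest denormalized (subnormal) double

# A normalized number has a nonzero biased exponent field and an
# implicit leading 1; a denormal has exponent field 0 and only the
# explicit mantissa bits.
print(hex(bits(smallest_normal)))   # 0x10000000000000 (exponent field = 1)
print(hex(bits(subnormal)))         # 0x1 (exponent field = 0, mantissa = 1)

# Gradual underflow: halving the smallest normal does not flush to zero.
print(smallest_normal / 2 > 0.0)    # True, thanks to denormals
```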
     For this device we made the on-board register file 40 bits wide:
8 bits for exponent and 32 bits for sign and mantissa.  Without going
into all the details, integer operands used the 32 bit field and were
processed in the normal manner (the exponent field was passed through
unchanged).  Floating point was performed using the full 32 bit
mantissa, with rounding only on output from the chip.  This carried an
extra 8 bits of precision through intermediate operations that
remained within the chip.  The chip could also extract the exponent
field into an integer, so we could emulate multiple precision in
software if desired.  Not exactly IEEE standard, but it would probably
work well for many real applications where better precision is more
important than rigorous numerical error bounding.

(Please excuse this resend if you have seen it before; distribution
problems.)
-- 
Dale C. Earl   dce@mitre.org
These are my opinions, only my opinions, and nothing but my opinions!  DCE
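     [The exponent-extraction operation described above can be
mimicked in software on an ordinary IEEE single: the 1/8/23-bit field
layout is from the standard, while the function name below is just an
illustrative sketch, not the actual chip instruction.]

```python
import struct

def fields(x):
    """Split an IEEE 754 single-precision value into its sign bit,
    8-bit biased exponent, and 23 stored mantissa bits."""
    w = struct.unpack('<I', struct.pack('<f', x))[0]
    return w >> 31, (w >> 23) & 0xFF, w & 0x7FFFFF

# 1.5 = 1.1 (binary) * 2^0; bias 127, mantissa fraction .1 -> top bit set
sign, exp, man = fields(1.5)
print(sign, exp, man)   # 0 127 4194304

# With the exponent in hand as an integer, software can rescale and
# recombine pieces, which is the hook for multiple-precision emulation.
unbiased = exp - 127
print(unbiased)         # 0
```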