Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!WATSON.IBM.COM!jbs
From: jbs@WATSON.IBM.COM
Newsgroups: comp.arch
Subject: IEEE arithmetic (Goldberg paper)
Message-ID: <9106072346.AA08023@ucbvax.Berkeley.EDU>
Date: 7 Jun 91 23:50:01 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 52


         Bruce Seiler says:
  The main justification for having all the rounding modes was that the
+/- infinity modes made creating an interval arithmetic package much easier.
As I recall, without them an interval package could run 300 times slower than
straight floating point.  With them it was about 3 times slower.  Besides,
these rounding modes were cheap.

         I don't believe interval arithmetic is used enough to justify
any hardware support.  Providing quad precision (much more useful in
my opinion) would also provide some support for interval arithmetic.  In
any case I believe your estimate of 3 times slower than straight floating
point is unrealistic.  Does anyone have any examples of interval
arithmetic packages?  What is their speed compared to straight floating
point?  The rounding modes may appear cheap but they have a large hidden
cost.  For example, every intrinsic function subroutine must worry about
handling all rounding modes.
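
         For readers unfamiliar with the technique under discussion, here
is a rough sketch of an interval add with outward rounding.  It is only an
illustration, not any real package: Python's standard library has no call
to switch the hardware rounding mode, so the directed rounding is emulated
in software with exact rational arithmetic and math.nextafter (Python 3.9
or later).  The names add_down, add_up and interval_add are made up for
this example, and finite inputs with no overflow are assumed.

```python
import math
from fractions import Fraction


def add_down(a, b):
    """Sum of a and b rounded toward -infinity (emulated in software).

    Assumes finite inputs and no overflow.
    """
    s = a + b  # hardware result, rounded to nearest
    # Fractions represent the doubles exactly, so this comparison is exact.
    if Fraction(s) > Fraction(a) + Fraction(b):
        return math.nextafter(s, -math.inf)  # step down to the float below
    return s


def add_up(a, b):
    """Sum of a and b rounded toward +infinity (emulated in software)."""
    s = a + b
    if Fraction(s) < Fraction(a) + Fraction(b):
        return math.nextafter(s, math.inf)  # step up to the float above
    return s


def interval_add(x, y):
    """Add intervals x = (lo, hi) and y = (lo, hi) with outward rounding."""
    return (add_down(x[0], y[0]), add_up(x[1], y[1]))


if __name__ == "__main__":
    lo, hi = interval_add((0.1, 0.1), (0.2, 0.2))
    print(lo, hi)
```

Each endpoint here costs an ordinary add plus an exact comparison and a
possible correction step.  With round-toward-minus-infinity and
round-toward-plus-infinity available in hardware, each endpoint would be a
single add under the appropriate mode, which is the saving Seiler refers to.
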
         I said:
|>          I believe floating format sizes should be compatible with the
|> basic organization of the machine.  I.e., if your machine is based on
|> 8-bit bytes the length had best be a multiple of 8.  I believe 32 or 64
|> bits are typical widths of data paths in current machines.  Therefore
|> moving 79-bit things around will clearly be wasteful.  Consider a
|> machine with 64-bit floating registers and a 64-bit data path between
|> the floating unit and cache.  I see no efficient way to include a 79-bit
|> format.  A 128-bit format on the other hand is easily accommodated.
         Bruce Seiler said:
  The floating point chips from Intel and Motorola have 80-bit floating point
registers.  At the time, the standard developers decided that the main formats
should be a power of two in size so array addressing is fast, that it is useful
to have a format with a wider mantissa to have guard bits, and that a hardware
ALU finds it convenient for the mantissa to be a power of two in size.  Putting
the last two points together gives a lower limit of 79 bits.  That is a weird
number so Intel and Moto. rounded up to 80.  Actually, Moto. stores an extended
to memory in 96 bits so it is long word aligned.  If your compiler has good
register allocation, there will be few extended loads and stores.
  I expect you know this already, but I am trying to understand what you are
saying above.  Supporting any format wider than the FP ALU is going to be slow.
What IS so bad about extended?
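
         The bit counts being argued over can be checked with a few lines
of arithmetic.  The sketch below assumes the IEEE 754-1985 minimums for
double extended (64 bits of precision, a 15-bit exponent) and a 64-bit data
path as in my example above; the names format_width and transfers are made
up for illustration.  As I understand it, the 80-bit layouts store the
leading significand bit explicitly, which is where the extra bit comes from.

```python
SIGN_BITS = 1
EXPONENT_BITS = 15      # minimum exponent width for double extended
SIGNIFICAND_BITS = 64   # precision in bits (a power of two)


def format_width(explicit_leading_bit):
    """Total bits: sign + exponent + stored significand bits."""
    stored = SIGNIFICAND_BITS if explicit_leading_bit else SIGNIFICAND_BITS - 1
    return SIGN_BITS + EXPONENT_BITS + stored


def transfers(width_bits, path_bits=64):
    """Number of data-path transfers needed to move one operand."""
    return -(-width_bits // path_bits)  # ceiling division on integers


if __name__ == "__main__":
    for width in (64, format_width(False), format_width(True), 96, 128):
        print(width, "bits:", transfers(width), "transfer(s) on a 64-bit path")
```

On a 64-bit path a 79-bit or 80-bit operand takes two transfers, the same
as a 128-bit operand, while carrying far fewer useful bits.
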

         According to "Computer Architecture: A Quantitative Approach"
(Hennessy and Patterson, p. A-53, p. E-2) none of the MIPS R3010, Weitek 3364,
TI 8847, Intel i860, SPARC or Motorola M88000 architectures implement IEEE
double extended.  This represents a lot of independent support for my position
that the double extended format is not cost effective.  Apparently these
vendors do not agree that it is convenient for the fraction to be a power of 2
in length.  If your chip has 80-bit registers I fail to see how you can avoid
performing 80-bit loads and stores while saving and restoring these registers.
My objection to the IEEE double extended format is that it should be 128 bits
wide and called quad precision.  Also I do not agree that it is useful to
have a format with a few extra (guard) bits in the fraction.
                           James B. Shearer