Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!WATSON.IBM.COM!jbs
From: jbs@WATSON.IBM.COM
Newsgroups: comp.arch
Subject: IEEE arithmetic
Message-ID: <9106150258.AA16308@ucbvax.Berkeley.EDU>
Date: 15 Jun 91 02:33:45 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 63


          Dik Winter said (the issue is whether 3x is a realistic
estimate of how much slower interval arithmetic will be compared to
straight floating point):
 > How come unrealistic?  Example, interval add a to b giving c:
 >         set round to - inf;
 >         c.low = a.low + b.low;
 >         set round to + inf;
 >         c.upp = a.upp + b.upp;
 > why would this be more than 3x slower than straight floating point?
          I said:
 >         Because it's 4 instructions vs. 1.
          Dik Winter:
Thus it would be 4 times as slow if each instruction takes exactly the same
time to issue.  This is not true on processors where the FPU is not pipelined
(in that case the setting of the rounding mode is much faster than the
addition).  It might also not be true on some pipelined processors, i.e.
if the multiply unit is not pipelined.  Etc.  Do you indeed know machines
where the above sequence would be four times as slow as a normal multiply?

          It will be at least 4 times slower on the RISC System/6000.
It may be more if the set floating mode instructions screw up the
pipeline (I don't know whether they do or not).
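
A minimal sketch of the interval add under discussion, for readers who
want to experiment.  Python has no portable way to set the IEEE rounding
mode, so this sketch swaps in a different technique: it adds in the
default round-to-nearest mode and then steps each bound one float outward
with math.nextafter (Python 3.9 or later).  That gives a valid but
slightly wider enclosure than true directed rounding.  The name
interval_add and the (low, upp) tuple layout are assumptions made for
illustration, not anything from the quoted code.

```python
import math
from fractions import Fraction

def interval_add(a, b):
    """Add two intervals given as (low, upp) tuples.

    Assumption: stepping each bound one float outward stands in for
    the 'set round to -inf / +inf' instructions in the quoted code.
    """
    low = math.nextafter(a[0] + b[0], -math.inf)  # lower bound, pushed down
    upp = math.nextafter(a[1] + b[1], math.inf)   # upper bound, pushed up
    return (low, upp)

c = interval_add((0.1, 0.2), (0.3, 0.4))

# Check the enclosure against exact rational arithmetic on the same inputs.
exact_low = Fraction(0.1) + Fraction(0.3)
exact_upp = Fraction(0.2) + Fraction(0.4)
print(Fraction(c[0]) <= exact_low, exact_upp <= Fraction(c[1]))
```

As in the quoted sequence, there are four operations per interval add:
two additions plus two bound adjustments in place of the two rounding
mode changes.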
          I said
 >         Also let's see your code for interval multiplication.
          Dik Winter:
Not difficult, it goes something like:
        if a.low >= 0.0 then
                if b.low >= 0.0 then
                        4 operations
                else if b.upp <= 0.0 then
                        4 operations
                else if b.upp >= - b.low then
                        4 operations
                else
                        4 operations
the remainder is left as an exercise for the reader.  The code is a bit
hairy, and on modern RISC machines you would not put it inline (a subroutine
jump to a leaf routine is extremely cheap).  We see that also here on
many machines the multiply operations take most of the time.  (The code
assumes that intervals do not contain inf, if you allow that the code only
gets hairier, not much more time consuming.)

          Some comments:
     1.  Handling this by a subroutine call would seem to require 4
floating loads to pass the input arguments and 2 floating loads to
retrieve the answers.  This is already expensive.
     2.  All the floating compares and branches will severely impact
performance on the RISC System/6000.  The slowdown will be much more
than 3x.
     3.  Do you know of any machine where the above code will average
3x (or less) the time of a single multiply?
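
To make the compare-and-branch cost concrete, here is a sketch of a
complete interval multiply.  It is the usual nine-case sign analysis, not
a reconstruction of the quoted code (whose remaining branches were left
as an exercise).  As assumptions: intervals are (low, upp) tuples with
finite bounds, the name interval_mul is made up for illustration, and
directed rounding is replaced by a one-float outward step with
math.nextafter (Python 3.9 or later).

```python
import math

def interval_mul(a, b):
    """Multiply intervals a and b, each a (low, upp) tuple with low <= upp.

    Assumes finite bounds (no inf), as the quoted code does.
    """
    al, au = a
    bl, bu = b
    if al >= 0.0:                      # a is nonnegative
        if bl >= 0.0:
            lo, hi = al * bl, au * bu
        elif bu <= 0.0:
            lo, hi = au * bl, al * bu
        else:                          # b straddles zero
            lo, hi = au * bl, au * bu
    elif au <= 0.0:                    # a is nonpositive
        if bl >= 0.0:
            lo, hi = al * bu, au * bl
        elif bu <= 0.0:
            lo, hi = au * bu, al * bl
        else:                          # b straddles zero
            lo, hi = al * bu, al * bl
    else:                              # a straddles zero
        if bl >= 0.0:
            lo, hi = al * bu, au * bu
        elif bu <= 0.0:
            lo, hi = au * bl, al * bl
        else:                          # both straddle zero: 4 multiplies
            lo = min(al * bu, au * bl)
            hi = max(al * bl, au * bu)
    # Outward step stands in for directed rounding (an assumption).
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

print(interval_mul((-1.0, 3.0), (0.5, 4.0)))
```

Every call executes two to four floating compares and branches before
any multiply is issued, and the case where both intervals straddle zero
needs four multiplies rather than two.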
          On another topic Dik Winter said:
But you can get reasonable results without any pivoting if the condition
is very good!

          The need for pivoting is unrelated to the condition number.
This matrix
            0 1
            1 0
has condition number 1 and requires pivoting.  This matrix
            1+e 1-e
            1-e 1+e
(e small) has a bad condition number and is not helped by pivoting.
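
Both examples can be checked numerically.  The sketch below is
illustrative only: the helper names cond2 and solve, the choice
e = 1e-6, and the right-hand sides are all assumptions.  cond2 uses the
fact that for a 2x2 matrix the largest singular value squared is the
largest eigenvalue of A^T A, and the product of the two singular values
is |det A|.

```python
import math

def cond2(A):
    """2-norm condition number of a nonsingular 2x2 matrix A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A.
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    trace, det_ata = p + r, p * r - q * q
    disc = max(trace * trace - 4.0 * det_ata, 0.0)
    lam_max = (trace + math.sqrt(disc)) / 2.0
    # sigma_max^2 = lam_max and sigma_max * sigma_min = |det A|.
    return lam_max / abs(a * d - b * c)

def solve(A, rhs, pivot):
    """Gaussian elimination on a 2x2 system, with optional row pivoting."""
    (a, b), (c, d) = A
    f, g = rhs
    if pivot and abs(c) > abs(a):      # swap rows so the pivot is largest
        a, b, c, d, f, g = c, d, a, b, g, f
    m = c / a                          # ZeroDivisionError if the pivot is 0
    y = (g - m * f) / (d - m * b)
    x = (f - b * y) / a
    return (x, y)

swap = [[0.0, 1.0], [1.0, 0.0]]
e = 1e-6
near = [[1 + e, 1 - e], [1 - e, 1 + e]]

print(cond2(swap), cond2(near))
print(solve(swap, (2.0, 3.0), pivot=True))
try:
    solve(swap, (2.0, 3.0), pivot=False)
except ZeroDivisionError:
    print("no pivoting: zero pivot")
# Pivoting never swaps rows of the second matrix, so the results agree.
print(solve(near, (2.0, 2.0), pivot=True) == solve(near, (2.0, 2.0), pivot=False))
```

The first matrix fails without a row swap even though its condition
number is 1.  The second has condition number about 1/e, and because its
leading entry is already the largest in its column, pivoting changes
nothing.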
                      James B. Shearer