Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!mcsun!hp4nl!cwi.nl!dik
From: dik@cwi.nl (Dik T. Winter)
Newsgroups: comp.arch
Subject: Re: IEEE arithmetic (Goldberg paper)
Message-ID: <3695@charon.cwi.nl>
Date: 13 Jun 91 23:58:23 GMT
References: <9106120018.AA18733@ucbvax.Berkeley.EDU>
Sender: news@cwi.nl
Organization: CWI, Amsterdam
Lines: 53

In article <9106120018.AA18733@ucbvax.Berkeley.EDU> jbs@WATSON.IBM.COM writes:
 >  >          I don't believe interval arithmetic is used enough to justify
 >  > any hardware support.
Me:
 > Well, IBM thought it important enough to provide support in some models (43xx).
 > Acrith is the keyword.
JBS:
 >          Ah Acrith, I will let Kahan comment ...
No need, I know Kahan's opinions about Acrith.  Still, IBM thought it
important enough to put it in hardware (actually microcode, I think).
Perhaps under severe pressure from IBM Germany?  But at least as IBM saw
it, there was a niche where interval arithmetic is heavily used.

Me:
 > How come unrealistic?  Example, interval add a to b giving c:
 >         set round to - inf;
 >         c.low = a.low + b.low;
 >         set round to + inf;
 >         c.upp = a.upp + b.upp;
 > why would this be more than 3x slower than straight floating point?
JBS:
 >         Because it's 4 instructions vrs 1.
Thus it would be 4 times as slow only if each instruction takes exactly the
same time to issue.  This is not true on processors where the FPU is not
pipelined (in that case setting the rounding mode is much faster than the
addition).  It might also not be true on some pipelined processors, e.g.
if the multiply unit is not pipelined.  Etc.  Do you indeed know machines
where the above sequence would be four times as slow as a normal add?

JBS:
 >                                             Also lets see your code
 > for interval multiplication.
Not difficult, it goes something like:
	if a.low >= 0.0 then
		if b.low >= 0.0 then
			4 operations
		else if b.upp <= 0.0 then
			4 operations
		else if b.upp >= - b.low then
			4 operations
		else
			4 operations
the remainder is left as an exercise for the reader.  The code is a bit
hairy, and on modern RISC machines you would not put it inline (a subroutine
jump to a leaf routine is extremely cheap).  Here too, on many machines
the multiply operations take most of the time.  (The code assumes that
intervals do not contain inf; if you allow that, the code only gets
hairier, not much more time consuming.)

Of course division is hairier still, but reasonably feasible.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl