Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!pacific.mps.ohio-state.edu!linac!att!ucbvax!WATSON.IBM.COM!jbs
From: jbs@WATSON.IBM.COM
Newsgroups: comp.arch
Subject: Re: IEEE arithmetic
Message-ID: <9106190252.AA29755@ucbvax.Berkeley.EDU>
Date: 19 Jun 91 01:25:17 GMT
Sender: daemon@ucbvax.BERKELEY.EDU
Lines: 81

Dik Winter said:
The origin of the discussion was a remark that interval arithmetic in
software had been observed to be 300 times as slow as standard hardware
floating point, while interval arithmetic had been observed to be 3
times as slow.  Shearer questioned the number 3; I thought he
questioned the order of magnitude but apparently he wants to know the
exact number.  Of course that can not be given as it is entirely
machine dependent.  Also the factor would be a function of the mix of
operations performed.

        I believe the original remark referred to estimates, not
observations.  I questioned in passing whether 3x was a realistic
estimate.  I continue to believe it is extremely optimistic and that
10x slower would be more reasonable.

Dik Winter:
About the multiplication routine, JBS:
> 1.  Handling this by a subroutine call would seem to require 4
> floating loads to pass the input arguments and 2 floating loads to
> retrieve the answers.  This is already expensive.
But you have in most cases to load these 4 numbers anyhow, so why would
this be more expensive?  Why loads to retrieve the answers?  Why not
just let them sit in the FP registers?

        If you are doing your operations memory to memory then this is
correct.  However, if you are keeping intermediate results in registers,
as will often be the case, then you must move your 4 operands from the
registers they are in to the registers the subroutine expects them in,
and you must move your 2 answers out of the return registers into the
registers you want them in (if you just let them sit they will be wiped
out by the next call).  In general it will be possible to avoid doing
some of this movement, but I don't see how to avoid all of it.
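        A rough sketch of such an interval multiply subroutine, in C
and using the fesetround() interface to the directed rounding modes
(illustrative only; the routine name and calling convention are made
up for the example):

#include <fenv.h>
#pragma STDC FENV_ACCESS ON    /* the rounding mode is changed below */

/* Sketch: interval multiply [alo,ahi] x [blo,bhi] -> [*clo,*chi].
   Four operands in and two results out, so the caller has to marshal
   six floating values per call, which is the traffic discussed above.  */
void int_mul(double alo, double ahi, double blo, double bhi,
             double *clo, double *chi)
{
    int old = fegetround();
    double lo, hi, p;

    fesetround(FE_DOWNWARD);      /* lower bound: round toward -infinity */
    lo = alo * blo;
    if ((p = alo * bhi) < lo) lo = p;
    if ((p = ahi * blo) < lo) lo = p;
    if ((p = ahi * bhi) < lo) lo = p;

    fesetround(FE_UPWARD);        /* upper bound: round toward +infinity */
    hi = alo * blo;
    if ((p = alo * bhi) > hi) hi = p;
    if ((p = ahi * blo) > hi) hi = p;
    if ((p = ahi * bhi) > hi) hi = p;

    fesetround(old);
    *clo = lo;
    *chi = hi;
}

Even with the rounding done by the hardware modes, each interval
multiply in this sketch is four multiplies, six compares, two
rounding-mode switches, and a call, plus the operand and result
movement described above.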
Dik Winter:
Moreover, never it was said that there is an exact factor of 3
involved; that factor was simply observed, for a set of programs.
> 3.  Do you know of any machine where the above code will average
> 3x (or less) the time of a single multiply?
So this is irrelevant.

        Who observed this?  Under what conditions?

Dik Winter:
> On another topic Dik Winter said:
> But you can get reasonable results without any pivoting if the condition
> is very good!
I should have added the context, where not only the condition of the
complete matrix is very good, but also of all its principal minors.

        This still isn't right.  Consider

             e  1
             1  e

with e small (but not zero).  The matrix and its leading principal
minor are both well conditioned, yet elimination without pivoting must
divide by the tiny e, and the resulting element growth can destroy the
accuracy of the answer.

Dik Winter (in a later post):
I am *not* an advocate for interval arithmetic (the people at Karlsruhe
are).  I do not use it.  But I object to the way Shearer handles this:
a.  Shearer asks: what is the justification for the different rounding
    modes.
b.  Many responses come: interval arithmetic.
c.  Shearer asks: would it not be better helped with quad arithmetic?
d.  Response: observed speed difference a factor 3 with hardware
    rounding modes, a factor 300 in software.
e.  Shearer questions the factor 3.  Apparently he believes the factor
    300 (does he?).  Even if the factor 3 would degrade on other
    machines to a factor of 5 or even 10, the difference with 300 is
    still striking.
I ask Shearer again: come with an interval add assuming the base
arithmetic is round to nearest only (or even worse, with truncating
arithmetic, which you advocate in another article).

        Some comments:
        Regarding b: if interval arithmetic is the only reason for the
different rounding modes, then I think they may safely be junked.
        Regarding c: what I actually said was that I thought quad
precision would provide some support for interval arithmetic (not
better support).  In any case, upon further reflection, I will withdraw
this statement.
        Regarding d: as I said above, I believe these were estimates.
        Regarding e: I don't really believe the 300x either, and I
particularly don't believe the implied 100x ratio.
        Regarding how I would implement interval arithmetic: it is not
particularly difficult to do without the rounding modes as long as you
don't insist on maintaining the tightest possible intervals (a sketch
of such an add appears below).  There is no great loss in being a
little sloppy, since an extra ulp only matters for narrow intervals,
and the main problem with interval arithmetic is that the intervals
don't stay narrow even if you are careful.
                         James B. Shearer
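The sketch referred to above: an interval add in C that assumes only
round-to-nearest arithmetic and widens each endpoint outward by one
ulp with nextafter().  This is illustrative only and deliberately does
not give the tightest possible bounds:

#include <math.h>

/* Interval add [alo,ahi] + [blo,bhi] -> [*clo,*chi] using only
   round-to-nearest arithmetic.  Each sum is pushed outward by one
   ulp, so the result is guaranteed to enclose the exact interval
   while being at most one ulp wider per endpoint than necessary.  */
void int_add(double alo, double ahi, double blo, double bhi,
             double *clo, double *chi)
{
    *clo = nextafter(alo + blo, -HUGE_VAL);   /* move lower bound down */
    *chi = nextafter(ahi + bhi,  HUGE_VAL);   /* move upper bound up   */
}

The same one-ulp widening works for the other operations (ignoring
overflow and other edge cases), which is the sense in which the sloppy
approach needs no directed rounding modes at all.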