Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!apple!bbn!bbn.com!slackey
From: slackey@bbn.com (Stan Lackey)
Newsgroups: comp.arch
Subject: Re: Bandwidth and RISC vs. CISC
Message-ID: <39095@bbn.COM>
Date: 25 Apr 89 15:50:11 GMT
References: <38853@bbn.COM> <423@bnr-fos.UUCP> <17417@cup.portal.com> <39049@bbn.COM> <100891@sun.Eng.Sun.COM>
Sender: news@bbn.COM
Reply-To: slackey@BBN.COM (Stan Lackey)
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 46

In article <100891@sun.Eng.Sun.COM> dgh%dgh@Sun.COM (David Hough) writes:
>EXACTLY .5 is no harder than correct directed rounding. You have to
>(in principle) develop all the digits, propagate carries, and remember
>whether any shifted off were non-zero. Division and sqrt are simplified
>by the fact that EXACTLY .5 can't happen.

OK, it's only a problem in multiplication.

>> Note: It's prealigning a denormalized operand before a multiplication
>> that REALLY hurts.

>This event is rare enough that it needn't be as fast as a normal
>multiplication, so it's OK to slow down somewhat by holding the CPU,
>but not so rare that you want to punt to software. By throwing enough
>hardware at the problem you can make it as fast as the normal case.
>I don't advocate that but that's my understanding of what the Cydra-5 did.

Ever design a pipelined machine? It was probably easier in the Cydra
to make everything assume the worst case, than to deal with the
pipeline getting messed up. The new micros (at least the i860) trap
and expect software to fix things up, which includes parsing the
instructions in the pipe, and fixing up the saved version of the
internal data pipeline. I've seen statements in this newsgroup like
"not usable in a general purpose environment" when referring to the
i860. Talk about debug time!

In the Alliant we wanted to get the design done, and fit it on one
board, so we shut denorms off. (It sets the exception bits, though.)
After shipping for 4 years, there have still been no complaints.

>for instance nowadays
>everybody has correctly-rounded division and sqrt in hardware, except Intel

Re: one or two-cycle DP IEEE mul/add exist: Alliant is the only one I
know of, but it's because 1) the cycle time is abnormally long and 2)
denorms are not supported. I think it's valid to say that if floating
point (esp DP) ops take one cycle, your cycle time is too long.

>> I think the RISC
>> implementers should have a RISC-style floating point standard, though.

>DEC VAX floating-point architecture
>is well defined and a number of non-DEC implementations are available.

Sounds like a good idea to me! The IBM one is not useful, and (so it is
said) the Cray one is difficult to use. The VAX one is accurate enough
and has enough range for normal use, and if F or G aren't enough,
there's always H :-)
-Stan
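
For readers following the "EXACTLY .5" exchange above, here is a minimal
sketch of round-to-nearest-even on an exact significand product. It is an
illustration only, not the logic of any machine named in this thread. The
function name round_nearest_even, the parameter drop, and the 4-bit toy
significands are all assumptions made for the example. The point it shows
is the one quoted from David Hough: you need the full product, the first
discarded bit, and a record of whether any lower discarded bit was
non-zero.

```python
def round_nearest_even(product: int, drop: int) -> int:
    """Round an exact non-negative integer product to nearest-even,
    discarding its low `drop` bits (drop >= 1).

    Hypothetical helper for illustration; a real multiplier would also
    renormalize if the increment carries out of the kept field.
    """
    if drop < 1:
        raise ValueError("drop must be at least 1")
    kept = product >> drop                      # bits that survive
    guard = (product >> (drop - 1)) & 1         # first discarded bit, the ".5" place
    # Sticky: were any of the bits below the guard bit non-zero?
    sticky = (product & ((1 << (drop - 1)) - 1)) != 0
    if guard and (sticky or (kept & 1)):
        # Above one half: always round up.
        # Exactly one half (guard set, sticky clear): round up only if
        # the kept value is odd, so the result ends up even.
        kept += 1
    return kept


# Toy 4-bit significands, keeping the top 4 bits of a 7-bit product.
above_half = round_nearest_even(0b1011 * 0b1010, 3)  # discarded bits 110
tie_even = round_nearest_even(0b1010 * 0b1010, 3)    # discarded bits 100, kept even
tie_odd = round_nearest_even(0b1100 * 0b1001, 3)     # discarded bits 100, kept odd
below_half = round_nearest_even(0b1101011, 3)        # discarded bits 011
print(above_half, tie_even, tie_odd, below_half)
```

The two tie cases are the ones that only arise when the discarded bits are
exactly 100...0, which is why the sticky information cannot be skipped for
multiplication.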