Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!usc!bbn!bbn.com!slackey
From: slackey@bbn.com (Stan Lackey)
Newsgroups: comp.arch
Subject: Re: John von Neumann, sqrt instr
Message-ID: <44558@bbn.COM>
Date: 21 Aug 89 14:30:55 GMT
References: <21353@cup.portal.com> <25643@obiwan.mips.COM> <1513@l.cc.purdue.edu> <2376@wyse.wyse.com>
Sender: news@bbn.COM
Reply-To: slackey@BBN.COM (Stan Lackey)
Distribution: usa
Organization: Bolt Beranek and Newman Inc., Cambridge MA
Lines: 24

In article <2376@wyse.wyse.com> stevew@wyse.UUCP (Steve Wilson xttemp dept303) writes:
>In article <1513@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>>There is no reason why a machine with hardware division should not have
>>hardware square root.  It costs almost nothing.
>Ah, but that presumes that hardware division is warranted!
>Does the occurrence rate of divide/square root in scientific computing
>justify the cost?
>How does the scientific computing community feel about this functionality?

I was once involved in a trade-off between doing a divider gate array
and just iterating through the multiplier.  With a 3-cycle multiply,
a division iterated through the multiplier would take 18 cycles.
Likewise, vector multiplication would be one chime, and vector
division 18.  (We still considered it worthwhile to keep the vector
division instructions, in case a future implementation had a faster
divider, and also to make things easier for the compiler.)

We found that over a wide range of applications the ratio of mul to
div was roughly 3:1 to 5:1.  Worse still, Amdahl's Law predicted that
with those numbers we'd be spending nearly 50% of our time doing
division!!  In the applications we wanted to do well at, that is.

We did the gate array.  And yes, it has square root.  I personally
don't know how important sqrt is, but it was cheap enough.
-Stan
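
A rough sketch of the arithmetic behind the Amdahl's Law remark above.
This is not the analysis described in the post.  It takes the 3-cycle
multiply, the 18-cycle divide, and the 3:1 to 5:1 mul:div ratios from
the post, and assumes that only multiplies and divides are counted, so
the shares it prints are an upper bound on the division share of total
run time.  The function names and the 6x divider speedup are
hypothetical, chosen only for illustration.

```python
def division_time_share(muls_per_div, mul_cycles=3, div_cycles=18):
    """Fraction of combined multiply+divide time spent dividing.

    Assumption: every other instruction is ignored, so this overstates
    the division share of a whole program.
    """
    mul_time = muls_per_div * mul_cycles
    return div_cycles / (mul_time + div_cycles)


def amdahl_speedup(fraction, factor):
    """Amdahl's Law: overall speedup when `fraction` of the run time
    is made `factor` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    # Cycle counts and ratios are the ones quoted in the post.
    for ratio in (3, 5):
        share = division_time_share(ratio)
        print(f"mul:div = {ratio}:1 -> division is {share:.0%} of mul+div time")

    # Hypothetical: division is half of total run time (the post's
    # "nearly 50%") and a hardware divider is 6x faster.
    print(f"overall speedup: {amdahl_speedup(0.5, 6.0):.2f}x")
```

Counting only multiplies and divides, division takes about 67% of the
time at 3:1 and about 55% at 5:1.  Any other work in the program pulls
that share down, which is in line with the "nearly 50%" figure quoted
above.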