Path: utzoo!utgpu!attcan!uunet!yale!mfci!colwell
From: colwell@mfci.UUCP (Robert Colwell)
Newsgroups: comp.arch
Subject: HW sqrt/div (was RISC v. CISC --more misconceptions)
Message-ID: <544@m3.mfci.UUCP>
Date: 3 Nov 88 13:41:50 GMT
References: <156@gloom.UUCP> <18931@apple.Apple.COM> <40@sopwith.UUCP> <19762@apple.Apple.COM> <1002@l.cc.purdue.edu> <19811@apple.Apple.COM>
Sender: colwell@mfci.UUCP
Reply-To: colwell@mfci.UUCP (Robert Colwell)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 21

In article <19811@apple.Apple.COM> baum@apple.UUCP (Allen Baum) writes:
>Square root is the same category as divide. Hardware is slow, so algorithms
>tend to avoid them. The reason is fundamental. The hardware is slow, and it
>is exceedingly difficult to make it faster. Strangely enough, floating point
>divide can be made to run much faster, because of its normalized operands.

One of the biggest problems with hardware sqrt/divide is that their
hardware implementations want to be iterative, which makes these ops
non-pipeline-able. That's a very bad feature in machines where all
other arithmetic ops, esp. flt. pt. multiply/adds, are pipelined. A
software implementation of sqrt or div uses the pipelined ops, so the
net effect is that the latency of a single op will be higher, but the
net throughput is much better.

Of course, the hardware can get you the last bit correctly rounded to
IEEE specifications; the software could too, in principle, but I've
not yet seen anyone do it.

Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090
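
As an illustration of the software approach described in the post, here
is a minimal sketch of divide and square root built only from multiplies
and adds/subtracts. The post does not name an algorithm; Newton-Raphson
iteration is assumed here as one common choice. The function names
(recip_newton, rsqrt_newton, sqrt_newton), the starting guesses, and the
iteration counts are all hypothetical, and inputs are assumed to be
finite and positive. The results are close to the true values but are
not guaranteed to be correctly rounded in the last bit, which is the
caveat raised in the post.

```python
import math

def recip_newton(d, iterations=4):
    """Approximate 1/d for finite d > 0 using only multiplies and subtracts."""
    # Split off the exponent (no divide involved): d = m * 2**e, 0.5 <= m < 1.
    m, e = math.frexp(d)
    # Linear first guess for 1/m on [0.5, 1): 48/17 - (32/17)*m,
    # with both constants precomputed.
    x = 2.823529411764706 - 1.8823529411764706 * m
    for _ in range(iterations):
        # Newton step for f(x) = 1/x - m; the relative error roughly squares.
        x = x * (2.0 - m * x)
    # Undo the scaling: 1/d = (1/m) * 2**(-e).
    return math.ldexp(x, -e)

def rsqrt_newton(a, iterations=8):
    """Approximate 1/sqrt(a) for finite a > 0 using only multiplies and subtracts."""
    m, e = math.frexp(a)
    if e % 2:
        # Make the exponent even so that halving it is exact.
        m *= 2.0
        e -= 1
    # Crude first guess; m now lies in [0.5, 2), where x = 1 converges.
    x = 1.0
    for _ in range(iterations):
        # Newton step for f(x) = 1/x**2 - m.
        x = x * (1.5 - 0.5 * m * x * x)
    # Undo the scaling: 1/sqrt(a) = (1/sqrt(m)) * 2**(-e/2).
    return math.ldexp(x, -(e // 2))

def sqrt_newton(a):
    """Approximate sqrt(a) as a * (1/sqrt(a)): one more multiply."""
    return a * rsqrt_newton(a)
```

Each call is a serial chain of dependent multiplies, so the latency of
one result is long. The steps for different operands are independent of
each other, though, so on a machine with pipelined multiply/add units
many such chains can be in flight at once, which is where the
throughput advantage comes from.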