Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!psuvax1!psuvm!cmh117
From: CMH117@PSUVM.BITNET (Charles Hannum)
Newsgroups: comp.graphics
Subject: Re: APPROXIMATING SQUARE_ROOT
Message-ID: <89331.211723CMH117@PSUVM.BITNET>
Date: 28 Nov 89 02:17:23 GMT
References: <1989Oct26.155201.5087@rpi.edu> <14397@well.UUCP> <367@oscsuna.osc.edu> <63127@psuecl.bitnet> <542@xdos.UUCP>
Distribution: usa
Organization: Penn State University
Lines: 53

In article <542@xdos.UUCP>, doug@xdos.UUCP (Doug Merritt) says:
>
>In article <63127@psuecl.bitnet> c9h@psuecl.bitnet writes:
>>Even with floating point numbers, a divide by 2 can be extremely fast
>>if coded such that the exponent is simply decremented by one. I would hope
>>that many compilers do this optimization automatically, but I don't know.
>
>This breaks when the number is not an exact power of two, because you're
>not doing anything to the mantissa. So there'd better not be any compilers
>that do this.

We don't need to do anything to the mantissa. The exponent represents a
power of two; that power, multiplied by the mantissa (with the implicit
leading 1 restored, if necessary), yields the number. Simply incrementing
or decrementing the exponent therefore multiplies or divides the whole
number by 2.

>>On the other hand, if you want to take the square root of an integer, you
>>could simply keep subtracting successive odd numbers until the next odd
>>number is larger than your accumulator. The number of odd numbers thus
>>subtracted is the greatest integer function of the square root. i.e.:
>
>This is O(sqrt(N)), whereas e.g. binary search is O(log(N)). I would think
>that this might be a deoptimization on machines with an efficient
>multiply. Binary search requires an inner loop multiplication, but yours
>requires lots more worst-case iterations.
>
>For 16 bit integers, your algorithm's worst case is 255 iterations,
>where binary search is 16.
>So the break-even point is a machine where
>16-bit multiplication is 15 to 16 times slower than addition. (I've
>misplaced my 68000 manual, but from memory I think the ratio of times for
>addition versus multiplication makes it approximately break-even for
>that processor.)
>
>For 32-bit integers, your worst case is 65535 iterations, which couldn't
>possibly beat binary search's 32 iterations on any machine, even with
>mildly inefficient software-synthesized multiplication.
>
>But if your inputs are limited to, say, 2^10 or less, I guess it could
>be a win.
>
>Clever algorithm for the right circumstances, but not a general-case
>optimization.

It wasn't meant to work in the general case. Hell, it would be ridiculous
to use it for even 16-bit integers if your processor has a fast divide.
It's really only good for 8-bit numbers ... and then you might as well
use a lookup table. The iterative method is a much better solution.

--
- Charles Martin Hannum II       "Klein bottle for sale ... inquire within."
   (That's Charles to you!)      "To life immortal!"
  cmh117@psuvm.{bitnet,psu.edu}  "No noozzzz izzz netzzzsnoozzzzz..."
  c9h@psuecl.{bitnet,psu.edu}    "Mem'ry, all alone in the moonlight ..."