Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!psuvax1!psuvm!cmh117
From: CMH117@PSUVM.BITNET (Charles Hannum)
Newsgroups: comp.graphics
Subject: Re: APPROXIMATING SQUARE_ROOT
Message-ID: <89331.211723CMH117@PSUVM.BITNET>
Date: 28 Nov 89 02:17:23 GMT
References: <1989Oct26.155201.5087@rpi.edu> <14397@well.UUCP>
<367@oscsuna.osc.edu> <63127@psuecl.bitnet> <542@xdos.UUCP>
Distribution: usa
Organization: Penn State University
Lines: 53
In article <542@xdos.UUCP>, doug@xdos.UUCP (Doug Merritt) says:
>
>In article <63127@psuecl.bitnet> c9h@psuecl.bitnet writes:
>>Even with floating point numbers, a divide by 2 can be extremely fast
>>if coded such that the exponent is simply decremented by one. I would hope
>>that many compilers do this optimization automatically, but I don't know.
>
>This breaks when the number is not an exact power of two, because you're
>not doing anything to the mantissa. So there'd better not be any compilers
>that do this.
We don't need to do anything to the mantissa. A float is just the mantissa
(with the implicit leading 1 restored, if the format hides it) times 2 raised
to the exponent, so decrementing or incrementing the exponent alone divides
or multiplies the whole number by 2 exactly.
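For a normalized IEEE-754 double this can be sketched in C. (Hypothetical
helper, not from either poster; it assumes the IEEE-754 64-bit layout and a
normalized, nonzero, finite input, so denormals, zeros, infinities, and NaNs
would need separate handling.)

```c
#include <stdint.h>
#include <string.h>

/* Halve a double by decrementing its biased exponent field.
 * Assumes the IEEE-754 double layout (11-bit exponent in bits
 * 62..52) and a normalized, nonzero, finite input. */
double half_by_exponent(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* view the bit pattern */
    bits -= (uint64_t)1 << 52;        /* exponent field -= 1 */
    memcpy(&x, &bits, sizeof bits);
    return x;
}
```

The mantissa bits pass through untouched, which is exactly the point: the
result is bit-for-bit the same as dividing by 2.0.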
>>On the other hand, if you want to take the square root of an integer, you
>>could simply keep subtracting successive odd numbers until the next odd
>>number is larger than your accumulator. The number of odd numbers thus
>>subtracted is the greatest integer function of the square root. i.e.:
>
>This is O(sqrt(N)), whereas e.g. binary search is O(log(N)). I would think
>that this might be a deoptimization on machines with an efficient
>multiply. Binary search requires an inner loop multiplication, but yours
>requires lots more worst-case iterations.
>
>For 16 bit integers, your algorithm's worst case is 255 iterations,
>where binary search is 16. So the break-even point is a machine where
>16-bit multiplication is 15 to 16 times slower than addition. (I've
>misplaced my 68000 manual, but from memory I think the ratio of times for
>addition versus multiplication make it approximately break-even for
>that processor.)
>
>For 32-bit integers, your worst case is 65535 iterations, which couldn't
>possibly beat binary search's 32 iterations on any machine, even with
>mildly inefficient software-synthesized multiplication.
>
>But if your inputs are limited to say, 2^10 or less, I guess it could
>be a win.
>
>Clever algorithm for the right circumstances, but not a general case
>optimization.
It wasn't meant to work in the general case. Hell, it would be ridiculous
to use it for even 16-bit integers if your processor has a fast divide.
It's really only good for 8-bit numbers ... and then you might as well use
a lookup table. The iterative method is a much better solution.
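For the record, the two methods under discussion can be sketched like so.
(Hypothetical helpers, not code from either poster; the binary search
assumes inputs fit in 32 bits, so the root is at most 65535.)

```c
/* Integer square root by subtracting successive odd numbers.
 * Since 1 + 3 + 5 + ... + (2k-1) = k*k, the count of odd numbers
 * subtracted before the accumulator runs out is floor(sqrt(n)).
 * O(sqrt(n)) iterations, but only additions and compares. */
unsigned isqrt_odd(unsigned long n)
{
    unsigned long odd = 1;
    unsigned count = 0;
    while (n >= odd) {
        n -= odd;
        odd += 2;
        count++;
    }
    return count;
}

/* The same result by binary search on the root: O(log n)
 * iterations, one multiply per iteration.  Assumes n fits in
 * 32 bits, so the root is at most 65535. */
unsigned isqrt_bin(unsigned long n)
{
    unsigned long lo = 0, hi = 65535;
    while (lo < hi) {
        unsigned long mid = (lo + hi + 1) / 2;
        if (mid * mid <= n)
            lo = mid;
        else
            hi = mid - 1;
    }
    return (unsigned)lo;
}
```

Which one wins depends on the cost of multiply versus add on the target
machine, as argued above; for 8-bit inputs a 256-entry lookup table beats
both.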
--
- Charles Martin Hannum II "Klein bottle for sale ... inquire within."
(That's Charles to you!) "To life immortal!"
cmh117@psuvm.{bitnet,psu.edu} "No noozzzz izzz netzzzsnoozzzzz..."
c9h@psuecl.{bitnet,psu.edu} "Mem'ry, all alone in the moonlight ..."