Path: utzoo!utgpu!jarvis.csri.toronto.edu!clyde.concordia.ca!uunet!tut.cis.ohio-state.edu!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!apple!kchen From: kchen@Apple.COM (Kok Chen) Newsgroups: comp.graphics Subject: Re: Integral Square Root Message-ID: <37525@apple.Apple.COM> Date: 31 Dec 89 20:22:05 GMT References: <9170@cbmvax.commodore.com> <21550@mimsy.umd.edu> <9175@cbmvax.commodore.com> Organization: Apple Computer Inc., Cupertino, CA Lines: 143 mitchell@cbmvax.commodore.com (Fred Mitchell - PA) writes: >In article <21550@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes: >>In article <9170@cbmvax.commodore.com> mitchell@cbmvax.commodore.com >>(Fred Mitchell - PA) writes: >>>Here's one that is short, fast, and sweet. >> >>... and wrong: the line >> >> if ( (rb = r+(b=1<> >>uses and sets rb in an unknown order. (Mr. Mitchell's compiler clearly >>... > >NOT WRONG- ... > >Now the one stipulation of the C standard is that INNER PARENTHESIS >GETS EVALUATED FIRST!!!! So the first thing that MUST be evaluated >first is: > (b = 1 << i) > >then: > (rb = r + b) > >and finally: > rb * rb <= n > >Don't fault Chris however. It is very terse and a bit hard to understand, >especially for neophyte C programmers. First, a comment and then another suggestion for doing integer square roots. I am afraid I have to agree with Mr. Torek on this one. The K&R definition only states that the PRIORITY of evaluation is as stated by Mr. Mitchell, not the TEMPORAL ORDER in which each node of the syntax tree is evaluated. I.e., the "rb" in the "* rb" branch of the expression could be evaluated BEFORE or AFTER the "rb =" branch is evaluated. The only thing a compiler writer has to make sure of is that the "*" operator is applied AFTER the "rb =". I wonder if other compiler writers agree on this one (I am not sure how KCC would have handled it, either; it's been too many eons ago. But, I suspect it would not yield the answer Mr. Mitchell desires. The reason is that KCC would always try to swap branches of a commutative operator so that the left branch has smaller Sethi weight than the right branch, before doing a left-to-right tree-walk, so as to minimize register usage. By doing so, it would first evaluate rb before going to the branch that assigns to it. Anyway, that topic belongs to a different news group :-). In any case, it is all quite moot, since there are other algorithms that have N (= number of bits) iterations that do not even involve multiplications. Such algorithms could be faster on machines that do not multiply as fast as they add or shift. One that I came up with a few years back (when I had need of a fast square root evaluator) is based on successive approximation and the fact that (a+b)^2 = (a^2 + 2ab + b^2). Each iteration achieves an extra bit of accuracy, but instead of having to square a number for the test, the (a+b)^2 trick is used. /* * Example of (16 bit) integer square root based * on (a+b)^2 = (a^2 + 2ab + b^2). Yields largest integer * whose square is less than or equal n. * * binValue is a table of the numbers 2^0, 2^1, 2^2, etc. * binSquare is a table of the square of 2^0, 2^1, 2^2, etc. * * At all times, succApproximation = iSqrt^2. */ int binValue[] = { 0x80, 0x40, 0x20, 0x10, 0x8, 0x4, 0x2, 0x1 } ; int binSquare[] = { 0x4000, 0x1000, 0x400, 0x100, 0x40, 0x10 , 0x4, 0x1 } ; int approxSquareRoot( n ) register int n ; { register int succApproximation, iSqrt, i, test, bitPosition ; succApproximation = iSqrt = 0 ; bitPosition = 8 ; /* 2.2^7 */ for ( i = 0; i < 8; i++ ) { /* * Let test be the square of the the sum * of the current approximation and the next * bit value, i.e., (iSqrt+binValue)^2. */ test = succApproximation /* a^2 */ + ( iSqrt << bitPosition ) /* 2.a.b */ + binSquare[i] ; /* b^2 */ /* * If test is no greater than n, binValue[i] can be added * to the current approximation of the square root. */ if ( test <= n ) { succApproximation = test ; iSqrt += binValue[i] ; } --bitPosition ; } return ( iSqrt ) ; } The above is written freehand and I don't know for sure it will compile :-). I am just trying to convey the flavour of the algorithm. The algorithm is so obvious that I am sure it has been written up somewhere already. My apologies to any prior discoverers for any lack of attribution to their works. In fact, I would not at all be surprised if this is in some piece of hardware somewhere since the two tables can just as well be generated by some shift register working in real time (indeed, if your machine can shift faster than it can index an element of a table, it would be faster to generate the table elements iteratively too). I am equally sure that there are faster ways to do square roots than this. (By the way, loonggg ago [like almost 25 years ago, when we were all programming in FORTRAN :-)], when I first got interested in algorithms, I looked up what the algorithm used by Abacus practitioners were, to obtain the square root of a number. Well, no big revalation - the algorithm is precisely the one we all learnt in primary school where you gather two digits at a time. Sigh.) Today? I try looking at a higher level - for algorithms that minimize the use of square roots. There ain't such a thing as a FAST square root. :-) Anytime you have to iterate 8 to 16 times to reach your results... Regards, Kok Chen kchen@apple.COM Apple Computer, Inc.