Xref: utzoo sci.physics:17583 sci.math:15894 comp.theory.dynamic-sys:196
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!elroy.jpl.nasa.gov!decwrl!pa.dec.com!shlump.nac.dec.com!ryn.mro4.dec.com!allvax.enet.dec.com!jroth
From: jroth@allvax.enet.dec.com (Jim Roth)
Newsgroups: sci.physics,sci.math,comp.theory.dynamic-sys
Subject: Re: Newton's method (was square roots by hand or computer)
Message-ID: <4161@ryn.mro4.dec.com>
Date: 19 Mar 91 08:59:57 GMT
Sender: guest@ryn.mro4.dec.com
Followup-To: sci.physics
Organization: Digital Equipment Corporation
Lines: 52


In article <1991Mar19.023209.22985@jato.jpl.nasa.gov>, vsnyder@jato.jpl.nasa.gov (Van Snyder) writes...
>In article <1991Mar17.120702.9205@jarvis.csri.toronto.edu> hofbauer@csri.toronto.edu (John Hofbauer) writes:
>>    [lots of stuff deleted]
>>... The SQRT routine
>>on your favourite computer uses Newton's Method. The starting value
>>is determined from a simple linear min-max approximation which guarantees
>>that the first 3 bits are correct. Three iterations gives full 24-bit
>>single precision accuracy.

A fast way of priming a sqrt routine is to do a table indexing on the
exponent and the first 8 bits of the fraction - this gives about 8 bits
of relative accuracy which is then doubled for each further Newton
iteration.  This is noticibly faster than fiddling with minimax
linear approximations in a really optimized routine.

An interesting aside, the I860 processor does not have a divide
instruction, but it does have a couple of instructions that return
an 8 bit accurate floating point approximation to the reciprocal or
reciprocal of the root of a number, which are frequently needed in
graphics computations.  For these, Newton only needs multiplies
which are so fast and pipelined that the loss of a real divide is
not so bad (the divide on, say, a MIPS R2010 floating point processor,
takes numerous cycles, so this isn't as crazy as it sounds.)

>Most codes to find all roots of polynomials, e.g. Jenkins and Traub, use a
>process called "deflation", equivalent to synthetic division, to get rid of
>roots that have been already discovered.  I.e., after finding a1, divide the
>given polynomial by (x-a1).  They also use a variation of Newton's method
>called Miller's method, that finds complex roots.  To avoid these problems,
>at some expense in space, simply form the companion matrix and compute it's
>eigenvalues, using a routine from EISPACK.  This is easy since the companion
>matrix is already in Hessenberg form.

The trick of using the companion matrix is fairly neat, but a Jenkens
Traub rootfinder is faster particularly for high order polynomials.

Note also that Jenkens and Traub is really a form of shifted eigenvalue
solving anyway, it is in fact equivalent to shifted inverse powering and
Rayleigh quotient iteration.  Perhaps surprisingly a complex version
is slightly more accurate than one specially coded for the roots of real
polynomials, though the difference is not large in practice.

A Jenkens Traub finder is available from the ACM TOMS on netlib.

If you can avoid it, it's best not to have to represent polynomials in
polynomial form - for example in filter synthesis one needs roots of
the sum of two polynomials with known roots, and it's *much* more accurate
to directly iterate on the difference of the product form of those polynomials
rather than multiplying them out and forming a result polynomial.

- Jim