Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!bellcore!decvax!decwrl!pyramid!pesnta!phri!cmcl2!lanl!jlg
From: jlg@lanl.UUCP
Newsgroups: net.arch,net.math
Subject: Re: IEEE f.p. standard
Message-ID: <2225@lanl.ARPA>
Date: Tue, 22-Apr-86 16:14:49 EST
Article-I.D.: lanl.2225
Posted: Tue Apr 22 16:14:49 1986
Date-Received: Thu, 24-Apr-86 06:46:02 EST
References: <2048@lanl.ARPA> <5335@alice.uUCp>
Reply-To: jlg@a.UUCP (Jim Giles)
Organization: Los Alamos National Laboratory
Lines: 56
Xref: watmath net.arch:3161 net.math:3117

In article <5335@alice.uUCp> ark@alice.UucP (Andrew Koenig) writes:
>> COS(X) is defined as SIN(X-PI/2).  Now for |X| <= PI, we require that
>> SIN(X) = RND(sin(X)).  This is the correctly rounded value of the function
>> value.
>
>Unfortunately, even this fairly simple definition is quite hard to implement.

It's not really HARD to implement so much as it's slow.  Since all values
of X are rational (of the form I*2^J, where I and J are integers), you can
implement the above requirements using CORDIC or some similar algorithm.
A side effect of the CORDIC algorithm is that it leaves a 'remainder' after
each iteration so that, after the last iteration, the remainder can be used
to tell you which way to round.

The value X must be rational for two reasons.  First: the SIN of a rational
is always irrational (except zero - an easy case), therefore there is no
possibility of a 'tie' - you will always have a remainder and the direction
to round will always be unambiguous.  Second: the calculated remainder is
always 'accurate', assuming that the CORDIC coefficients are chosen
properly. ('Accurate' is in quotes here because it is usually NOT the real
remainder, but only one that is guaranteed to generate the correct result.
This is true because the CORDIC coefficients are only stored to finite
precision themselves.)

For all this to work, the CORDIC algorithm must be carried out with about
twice the precision of the target result.  CORDIC is slow anyway, and
double precision intermediate calculations make it worse.  However it is
*possible* to achieve the precision stated above.

As I said, I am designing hardware for the Trig functions mainly for
recreation.  Of all the algorithms I've tried, only a few could give the
accuracy I wanted.  And of those, only things like CORDIC are easily
implemented in hardware.  For a double-extended SIN function, CORDIC
requires 64 iterations through the algorithm.  If a bit-sliced hardware
version could be made that cycles at about 100 ns per iteration, the SIN
function would take about 7 micro secs (not including argument reduction) -
hardware is available that could do it (a lot of expensive chips though).
I wonder how much trouble it would be to put it in VLSI, and how much it
would be slowed.

My only other experience with this issue is what can be found in
mathematical support libraries on mainframe computers.  Both the Cray
and CDC (NOS) support libraries use polynomial interpolants, the error
characteristics of which are arcane (to say the least).  Rational
polynomial interpolants are better but a rational polynomial requires a
divide - which, in hardware, has some of the same implementation
complexities as the trig functions (ie. it must be done iteratively, the
true round is sometimes hard to find, or it is slow - like the shift
subtract algorithm:  VERY similar to CORDIC trig functions).

Presumably, other people are working at this problem at a more professional
level.  What algorithm is presently used for this kind of thing?  What
error bounds are accepted?

J. Giles
Los Alamos