Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!usc!sdd.hp.com!ucsd!pacbell.com!pacbell!att!cbnewsc!sesv From: sesv@cbnewsc.att.com (steven.e.sommars) Newsgroups: comp.lang.c Subject: more than you wanted hear about pow(3M) Keywords: pow accuracy Message-ID: <1990Aug12.045525.25626@cbnewsc.att.com> Date: 12 Aug 90 04:55:25 GMT Organization: AT&T Bell Laboratories Lines: 175 Here's my contribution to the pow discussion. If you spot an error, please send me email corrections & I will summarize. ============================================================ What makes for a good pow(x,y)? An unordered, partial (yet overlapping) list might include: a) Accurate enough to satisfy users b) monotonic c) fast d) result doesn't overflow/underflow prematurely e) reproducible (does pow(x,y) vary over time or between machines?) f) function size (text+data) g) If pow(integer,integer) is exactly representable, that value should be returned. E.g., pow(2.,3.) should return exactly 8. h) Acceptable interface ``glue''. Returns what the user expects when given negative x and/or y, pow(0,0) (we all know the right answer to that one :-) ), one or both arguments are NaN's, arguments which cause over/underflow, ...and so forth. The list of special cases is long for pow(). i) Possible interface to traceback / exception handling mechanisms. j) adheres to whatever standard the user deems important (ANSI C, X/Open, SVID, NCEG, ...) k) support of all applicable rounding modes. l) if applicable, sets IEEE flags ``correctly''. Most pow users are willing to sacrifice one or more of these properties. How do computers compute pow(x,y) in production code? Here are some of the methods I know of. I'm sure that others exist. See the references if you want more details. 1) exp(y*log(x)). Inaccurate, as Andrew Koenig described. I've seen errors of about 1000 ulps on IEEE dp. However if you are on an architecture with enough bits of fast extended precision, this method may provide acceptable results. (If you have fast extra precision, other tricks can be used.) 2) Cody&Waite. More accurate than 1), but runs into trouble for some extreme cases, e.g., pow(1+epsilon,large) [Aside. I implemented this algorithm for IEEE dp. Andrew helped make my code portable (not an easy task). Alas, he also provided test cases to show where the Cody&Waite algorithm is ill-behaved. This is one reason why the code never made it into System V. It's still better than exp(y*log(x)).] 3) 4.3 BSD. A fairly sophisticated algorithm, implemented by K.C.Ng. It carefully uses exp(a*log(b)) in some regions, but simulates extra precision in the computations. The maximum observed error is about 140 ulps, but only in a small region. The typical error is much smaller. It is monotonic, has the pow(integer,integer) characteristic (see g) above). Many implementations, including SVR4, are based on this. 4) S. Gal's accurate table method: This is used in the IBM R6000, in IBM VS II Fortran (S/370), and perhaps elsewhere. 5) Another technique used in combination with a test for pow(x,integer). As noted by others, monotonicity may be violated when combining approximation techniques, even if both techniques are individually good. Also note that while repeated multiplication may be fast, it may be less accurate than a good pow() due to cumulative roundoff errors. 6) Some microprocessors have built in support for approximating transcendental functions. Sometimes this can be used for building an accurate pow. 7) Others. I have tested early versions of a table based pow designed by Peter Tang. IEEE dp accuracy is better than one ulp, usually less than 0.55 ulps. I don't know when the algorithm will be finalized/published. [Final aside. Here's one way to view the computation of exp(y*log(x)). let x+dx be the next IEEE dp number above positive x, x+dx=nextafter(x,HUGE) (Roughly, dx = x * 2^-53.) The value computed for pow(x,y) and pow(x+dx,y) should differ. However log(x) and log(x+dx) may be approximated by the same IEEE dp result! Thus pow(x,y) and pow(x+dx,y) may return the same value from the math library. To look at how this varies with x, consider the following table. x # of neighboring double precision numbers near x for which log(x) == log(neighbor) when the results are rounded to d.p. 2^10 4 2^20 8 2^40 16 2^80 32 For large x, log(x) loses the ability to distinguish arguments. For y~0, exp(y) does the same. Thus exp(y*log(x)) has problems.] ============================================================ References: R.C.Agarwal, et al., ``New Scalar and Vector Elementary Functions for the IBM System/370'' IBM J. Res. Develop, 30 March 1986. **Recommended reading. William J. Cody, Jr and William Waite, "Software Manual for the Elementary Functions," Prentice-Hall, 1980. **Two distinct uses for this book. a)cookbook for implementing **elementary function, b)gives code (elefunt) for testing **accuracy of implementations. Elefunt is available on netlib. S. Gal, ``Computing Elementary Functions: A New Approach for Achieving High Accuracy and Good Performance,'' IBM Technical Report 88.153, March 1985. **Techniques used for radix-16 floating point (see Agarwal paper). S. Gal, B. Bachelis ``An Accurate Elementary Mathematical Library for the IEEE Floating Point Standard,'' IBM Technical Report 88.223, June 1987. **Good background, but doesn't cover pow. **A revised version of this will go into ACM Transactions on **Math. Software. J.F. Hart, et al., "Computer Approximations." John Wiley & Sons, 1968. **One of the standard sources for approximations. Somewhat dated. ``IEEE Standard for Binary Floating-Point Arithmetic,'' ANSI/IEEE Std. 754-1985, IEEE, 1985. P. W. Markstein, ``Computation of elementary functions on the IBM RISC System/6000 processor,'' IBM J. Res. Develop, 34 January 1990. **Describes how >>correctly rounded<< elementary functions **can be implemented. A table based (Gal/Bachelis) technique produces **the properly rounded answer in 1023 out of 1024 cases. **The other 1 out of 1024 cases can be detected. **Without help, the approximation may round **to the wrong value, extra precision can be used to **recompute a better approximation. **This paper does not explicitly cover pow. Joel Silverstein, Steve Sommars, Yio-Chian Tao. ``The UNIX\(rg System Math Library, a Status Report'' January 1990 Usenix Proceedings (Washington). **Describes UNIX SVR3->SVR4 libm changes, **gives test results, includes pretty ulps plots. Eugene H. Spafford, John C. Flaspohler, ``A Report on the Accuracy of Some Floating Point Math Functions on Selected Computers,'' Georgia Tech. Technical Report GIT-SERC-86/02, GIT-ICS~85/06. **Survey of libm several years ago. Accuracy was **pretty bad. This was before 4.3BSD libm came out; **the situation is better now. ============================================================ Steve Sommars sesv@cbnewsc.att.com