Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!rpi!sarah!bingnews!kym From: kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell) Newsgroups: comp.lang.c Subject: Re: Log Library - How is it done in the library code? Keywords: log, library, series expansion Message-ID: <1991Mar16.201655.6104@bingvaxu.cc.binghamton.edu> Date: 16 Mar 91 20:16:55 GMT References: <15438@smoke.brl.mil> <1991Mar12.014416.4289@m.cs.uiuc.edu> <702@newave.UUCP> Organization: State University of New York at Binghamton Lines: 145 In article <702@newave.UUCP> john@newave.mn.org (John A. Weeks III) writes: >In article <1991Mar12.014416.4289@m.cs.uiuc.edu> joshi@m.cs.uiuc.edu (Anil Joshi) writes: [complains about library logs] >As an experiment, why not try running your program using log, then >try one of the alternate solutions. Compare the time it takes to run >each example. Then post your results to the net. While I would generally agree with John -- that micro-optimization of one or two instructions or function calls seldom (if ever) improve the overall performance of typical programs -- I'm going to play devil's advokate here. The following is a simple-minded routine that uses a table lookup to comupte an approximate natural log for 16-bit integers. It could be easily recoded to get logs of doubles or floats, and additionally you might like to remove the floating-point operations and replace them by fixed-point. (In fact in a similar problem posted in comp.arch regarding division, speeds for a similar routine approached almost twice the speed of the FLOATING POINT HARDWARE on some machines). Another point to consider -- the initial `find first 1' logic might be coded as a single instruction on some machines. The results vary over different machines -- some with fp hardware; some with `no' hardware :-). The results are as follows: Sun 3/60: 0.368515 Vax 8530: 0.421875 Sun SS1: .6 MIPS Magnum: 1 Vax 6440: 1.64706 So we see that on _some_ hardware (like 68k's) the library routines are at an apparent _big_ disadvantage; on some other platforms (the 6440 running VMS) the situation is reversed. I particularly like the MIPS where things everything's equal. :-) Although I realize I'm comparing apples to oranges here (my log routine only handles 16-bit integers; not the whole spectrum of double-precision arguments as does the C library routine) -- that _was_ the point of the original posting: can we ever take advantage of the fact that what we wish to do is somewhat less general than the library designer's intended? As John said in his post referenced above, the answer is a categorical ``it depends''. You be the judge from the figures above. -kym C program follows: ==================== log.c ==================== #define NDEBUG #include #include #include /* Taylor series: f(x+h) = f(x) + h f'(x) + 1/2 h^2 f''(x) + ... For f(x) = log(x) we have: f(x) = log(x) f'(x) = 1/x f''(x) = -1/x^2 f'''(x) = 2/x^3 Evaluate each of f's at x=1. f(1) = 0 f'(1) = 1 f''(1) = -1 f'''(1) = 2 Therefore: log(1+x) = = 0 + h 1 + 1/2 h^2 (-1) + 1/6 h^3 (2) + 1/24 h^4 (-6) ... = h - h^2/2 + h^3/3 - h^4/4 + ... If we wish to find: log(L a + b) = log(La (1 + b/(La))) = log(L) + log(a) + log(1+b/(La)) where L = 2^k we have: log(2^k) = k log(2) and therefore: log(x) = k*log(2) + log(a) + log(1+b/(La)) But as we know log(1+b/(La)) =about b/(La) - 1/2 (b/(La))^2 therefore log(x) =about k/lg(2) + log(a) + 1/L b/a - 1/2 1/L^2 (b/a)^2 */ #define NTAB 256 /* size of table */ #define NSAMP 10000L /* number of samples in test */ double atab[NTAB]; init(){ int i; for(i=1;i=0); assert(a