Path: utzoo!mnetor!uunet!husc6!bbn!uwmcsd1!ig!agate!ucbvax!hplabs!sdcrdcf!sdcsmb!sea!eggert
From: eggert@sea.sm.unisys.com (Paul Eggert)
Newsgroups: comp.lang.lisp
Subject: Re: Lisp vs. C Floating Point (Suns)
Message-ID: <29@sea.sm.unisys.com>
Date: 7 Feb 88 05:20:38 GMT
References: <557@spar.SPAR.SLB.COM>
Reply-To: eggert@sea.sm.unisys.com (Paul Eggert)
Organization: Unisys Santa Monica
Lines: 107

In article <557@spar.SPAR.SLB.COM> malcolm@spar.slb.com (Malcolm Slaney) writes:

	The only way to really compare two languages for performance is to
	lock two hackers into two different rooms, feed them equal amounts
	of caffeinated soda (:-) and see which one is faster after a month.

I wonder whether Slaney would still say this if he was locked in a room with
the job of translating dhrystone into Lisp? (:-)

I still don't think this benchmark is a good one.  But I'll play the game by
his rules for a few seconds.  If minor changes to code are permitted (see the
end of this note for details), then plain Sun C can run the FFT benchmark
about a third faster than Lucid 2.1.1:

	Time (in seconds) to execute 10 iterations of a 1024 point FFT

	Sun-3/160 68881     single   double
	C (SunOS 3.4)        3.5      3.7
	Lucid 2.1.1          4.7       ?

Slaney also writes:

	... we *were* seeing floating point run 10 times slower than Lisp
	because of the need for boxing (tags) and no type propagation.  I'm
	VERY happy to see that the lisp compilers are improving so much....

I'm also happy to see Lucid Lisp improving, and it's important to say that
floating point need not cause one to shun Lisp.  But I'm not yet convinced
that Lucid Lisp and Sun C have similar floating point performance, even
ignoring the 35% performance difference reported above.  First, no Lucid
times for FPA-equipped Sun-3s or for Sun-4s were reported; what is the problem
here?  Second, many Lisp systems don't support fast double precision, which is
crucial for many applications.
Can the question mark in the table above be replaced by a hard number, so that
we can see how well Lucid handles double precision?

----

The following changes to Slaney's (original) benchmark generate the
performance figures described above.  The changes to lines 178 and 264 fix
bugs that don't affect CPU time -- but they lead me to suspect that there are
more bugs!

18d17
< float fft_re[1025], fft_im[1025];
19a19,22
> #ifndef real
> #define real float
> #endif
> real fft_re[1025], fft_im[1025];
31c34
< float areal[], aimag[];
---
> real areal[], aimag[];
47,51c50,55
< register float *ar = areal, *ai = aimag;
< register int i = 1, j = 0, k = 0, m = 0;
< int n = 1024, nv2 = 512, le = 0,
< le1 = 0, ip = 0;
< float ur = 0.0, ui = 0.0, wr = 0.0, wi = 0.0, tr = 0.0, ti = 0.0;
---
> register int i = 1, ip;
> register real *ar = areal, *ai = aimag;
> register double r, s, ur, ui, tr, ti;
> register int le1, le, n = 1024, j, k, m = 0;
> register int nv2 = n>>1;
> register double wr, wi;
169,174c173,182
< tr = ar[ip]*ur - ai[ip] * ui;
< ti = ar[ip]*ui + ai[ip] * ur;
< ar[ip] = ar[i] - tr;
< ai[ip] = ai[i] - ti;
< ar[i] += tr;
< ai[i] += ti;
---
> r = ar[ip];
> s = ai[ip];
> tr = r*ur - s*ui;
> ti = r*ui + s*ur;
> r = ar[i];
> s = ai[i];
> ar[ip] = r - tr;
> ai[ip] = s - ti;
> ar[i] = r += tr;
> ai[i] = s += ti;
178c186
< ti = ur * wi + wi * wr;
---
> ui = ur * wi + ui * wr;
180d187
< ui = ti;
229c236
< float theta, phase;
---
> double theta, phase;
231c238
< float f, c, s;
---
> double f, c, s;
237c244
< float x;
---
> double x;
261c268
< float re, im;
---
> double re, im;
264c271
< if (abs(re) > fft_delta || abs(im) > fft_delta)
---
> if (fabs(re) > fft_delta || fabs(im) > fft_delta)
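
The two bug fixes in the diff above (old lines 178 and 264) can be looked at in
isolation.  What follows is only a rough sketch, not part of the benchmark:
the function names and the 1e-3 tolerance used in the usage note are
hypothetical, and the first statement of each step function is an assumed
reconstruction of a line the diff does not show.  With a prototype in scope,
abs() converts its argument to int, so an error of magnitude below 1 slips
through the check; the old line 178 used wi where ui belongs, so the twiddle
factor was not being multiplied as a complex number.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Error check in the style of the old line 264.  abs() works on int, so
   the value is truncated toward zero before the comparison.  The cast
   is written out here to make that conversion visible. */
int exceeds_with_abs(double x, double delta)
{
    assert(delta >= 0.0);           /* a tolerance is non-negative */
    return abs((int)x) > delta;
}

/* Error check in the style of the new line 271: fabs() keeps the
   fractional part, so small errors are actually compared with delta. */
int exceeds_with_fabs(double x, double delta)
{
    assert(delta >= 0.0);
    return fabs(x) > delta;
}

/* Twiddle update with the bug from the old line 178. */
void step_buggy(double *ur, double *ui, double wr, double wi)
{
    double tr = *ur * wr - *ui * wi;    /* assumed form of the unshown line */
    double ti = *ur * wi + wi * wr;     /* bug: wi * wr instead of ui * wr */
    *ur = tr;
    *ui = ti;
}

/* Corrected update: (ur + i*ui) * (wr + i*wi), as in the new line 186. */
void step_fixed(double *ur, double *ui, double wr, double wi)
{
    double tr = *ur * wr - *ui * wi;    /* real part, from the old ur and ui */
    *ui = *ur * wi + *ui * wr;          /* imaginary part, still the old ur */
    *ur = tr;
}
```

For example, exceeds_with_abs(0.5, 1e-3) reports no error while
exceeds_with_fabs(0.5, 1e-3) reports one.  Starting from ur = 1, ui = 0 and
applying step_fixed eight times with wr = wi = sqrt(1/2) (a rotation by 45
degrees) should bring the pair back to (1, 0) up to rounding; step_buggy
leaves the unit circle after a single step.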