Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!wuarchive!udel!rochester!pt.cs.cmu.edu!o.gp.cs.cmu.edu!ram From: ram+@cs.cmu.edu (Rob MacLachlan) Newsgroups: comp.lang.scheme Subject: Re: Benchmarking Scheme Message-ID: <1991May3.172024.21889@cs.cmu.edu> Date: 3 May 91 17:20:24 GMT References: <1991May2.155838.20830@bronze.ucs.indiana.edu> Sender: netnews@cs.cmu.edu (USENET News Group Software) Distribution: comp.lang.scheme Organization: School of Computer Science, Carnegie Mellon Lines: 112 In article <1991May2.155838.20830@bronze.ucs.indiana.edu> huntley@copper.ucs.indiana.edu (Haydn Huntley) writes: >I also performed some similar benchmarks on a VAX 8650 and a Sun 3, >and compared your C version compiled with GCC, and your Scheme version >compiled with Chez Scheme 3.2 and found that the C versions were 20 to >30 times faster. > >On a Sun 3 with a 16Mhz 68020 (cssun2.ucs.indiana.edu) >(pi 100) takes 4.84 secs (Chez Scheme 3.2) >(pi 800) takes 246.46 secs (Chez Scheme 3.2) >time pi 100 takes .2 secs (GCC 1.39) >time pi 800 takes 10.0 secs (GCC 1.39) >ratio of Scheme to C is approx. 24 On a DECStation 3100 running Mach: (pi 800) takes 2.7 secs (CMU Common Lisp Apr-28-1991, speed 3, safety 0) time pi 800 takes 2.1 secs (MIPS cc -O4) ratio of Lisp to C is 1.28 >So what is going on here? Well, simply put, Scheme is not nearly as efficient >a language for executing this benchmark as C is. [...] Scheme can be a more >convenient language to program [...] while a language like C requires more >care. [...] On the other hand, C is usually much more efficient, because C >compilers can usually store variables in registers, and at the worst, they can >always put them on the stack, while Scheme compilers sometimes must store >things out in the heap, and use registers less advantagously than C compilers >do. Yes, register allocation and avoidance of number consing are crucial to good numeric performance. But Scheme could do it too. >In addition, Scheme has latent typing and generic arithmetic, which >require tests to be made at run-time, while C has static typing, so it >doesn't have to do any tests on the types of variables. Supporting dynamic typing and generic arithmetic doesn't mean you have to use it all the time. That's what type inference and declarations are for. >| So what are the prospects for machine independent performance >| measures? Part of the appeal to me of the pi program is that it does >| an actual difficult computation with the best code I can devise. >| Previously I used an intentionally less that optimal fibonacci. Both of these benchmarks can be meaningful, but they measure *totally* different things. The recursive fibonacci is basically a function call benchmark (like the traditional TAK) which means that it can probably predict the performance of most lisp programs within a factor of 3. Given constant compiler technology, integer Specmarks will be a much better predictor. This PI program is doing extended precision arithmetic. Unless you are implementing extended precision arithmetic in Lisp (unusual but possible), then you are better off with another benchmark. And if you do find a compiler that generates reasonable code, you are probably mainly testing the speed of integer divide. I suggest reading comp.benchmarks for a while if you are interested in general benchmarking concepts. >[...] I've been working on a Scheme compiler this semester, and I learned how >expensive it is to implement generic arithmetic, garbage collection, and type >checking! Each one of them costs something like a factor of 2 in performance! Well, they may, but they don't have to, as the CMU numbers demonstrate. Common Lisp is pretty much a superset of scheme, modulo call/cc. And I don't see why supporting call/cc should make this program run any slower, since it doesn't use call/cc, and doesn't call any functions that might. I believe that the slowdown in non-CMU Lisps is entirely due to generic and bignum arithmetic. Note that the original C program was written for 32 bit integers, although it presumably could have been written otherwise. Because of this, any Lisp without efficient support for full-word integers will lose horribly, not only calling generic arithmetic routines, but also using bignum arithmetic. You should be able to get reasonable performance in pretty much any Lisp by changing the program to use FIXNUM arithmetic and adding any necessary declarations. CMU CL has an edge here because: -- You don't have to change the program. -- You don't have to add declarations out the wazoo. -- Code runs reasonably fast, even with full type checking. Here is the CL version: ________________________________________________________________ (in-package "USER") (defconstant r 100000) (defun pi (n) (declare (type (integer 0 1000) n) (inline ceiling)) (let* ((n (ceiling n 5)) (m (truncate (* n 5 3322) 1000)) (a (make-array (+ 1 m) :initial-element 2 :element-type '(unsigned-byte 12)))) (setf (aref a m) 4) (do ((j 1 (+ 1 j)) (q 0 0) (b 2 (rem q r))) ((> j n)) (declare (type (unsigned-byte 31) j b q)) (do ((k m (- k 1))) ((zerop k)) (declare (type (unsigned-byte 12) k)) (setq q (+ q (* (aref a k) r))) (multiple-value-bind (quo rem) (truncate q (+ 1 (* 2 k))) (declare (type (unsigned-byte 19) quo)) (setf (aref a k) rem) (setq q (* k quo)))) (format t "~5,'0D" (+ b (truncate q r))) (if (zerop (mod j 10)) (terpri) (write-char #\space))) (terpri)))