Path: utzoo!attcan!uunet!lll-winken!lll-tis!mordor!joyce!zodiac!ames!mailrus!cornell!uw-beaver!uoregon!will From: will@uoregon.uoregon.edu (William Clinger) Newsgroups: comp.lang.lisp Subject: Re: Summing a list Message-ID: <3065@uoregon.uoregon.edu> Date: 31 Oct 88 21:45:43 GMT References: <10794@srcsip.UUCP> <10813@srcsip.UUCP> <6135@csli.STANFORD.EDU> <29718@think.UUCP> <6143@csli.STANFORD.EDU> Reply-To: will@fog.UUCP (William Clinger) Organization: University of Oregon, Computer Science, Eugene OR Lines: 176 My apologies for continuing this discussion, but I grew tired of seeing timings without analysis and I thought I might as well prepare a lecture for this weird "Programming in Lisp" course I have to teach... I...obtained the following results under Allegro CL on a Mac II. Interestingly, the non-tail recursive version (as originally provided) is faster than the tail-recursive one! (Or have I made an error in the coding?) Mark Johnson Putting Mark's data with my measurements on a Macintosh II shows some of the dangers of generalizing from one implementation: Common Lisp Scheme comment NOTHING .00 .02 no-op DO-LOOP 1.85 6.92 loop using an assignment D0-LOOP1 2.85 * 1.07 loop using do TRECURSIVE 5.95 1.08 tail-recursive, local procedure TRECURSIVE1 3.67 1.05 tail-recursive, global procedure RECURSIVE 5.25 3.40 non-tr, local procedure RECURSIVE1 2.83 3.72 non-tr, global procedure REDUCE-+ 6.25 15.97 (reduce + list1000) APPLY-+ .92 --- (apply + list1000) EVAL-+ 12.32 6106.78 (eval (cons '+ list1000)) Considering Mark's question first, I too would have expected the tail-recursive version to be faster than the non-tail-recursive version, as it was in Scheme. Part of the answer, I think, is that Allegro CL does a pretty good job with non-tail-recursion but doesn't compile tail recursion as efficiently as it might. Another thing to consider is that the tail-recursive version has an extra argument to deal with. Evidently the overhead of passing that extra argument is about the same as the extra cost of non-tail-recursion compared to tail recursion in Allegro CL. MacScheme, on the other hand, does a poor job with non-tail-recursion so the extra argument is insignificant by comparison. While some might consider APPLY the most elegant solution, and it appears to be the fastest in Allegro CL, it isn't a portable solution. Implementations of Common Lisp are explicitly allowed to place limits on the number of arguments passed to a procedure, and the limit may be as small as 50. The Scheme language doesn't say anything about this, but MacScheme has such a limit anyway (about 250). Some Common Lisps compile tail-recursion as though it were non-tail-recursion, so TRECURSIVE and TRECURSIVE1 are not portable either in Common Lisp if you may have to deal with long lists. EVAL is a pretty clear loser, but why did MacScheme take almost two hours? Most implementations of Common Lisp use an interpretive EVAL, but Scheme systems often use a compiling EVAL. Furthermore the MacScheme compiler has a minor bug that makes the worst-case compile time quadratic, rather than linear. What's the bug? "If we're compiling an expression that looks like (+ E1 E2 ...), and its LENGTH is greater than 2, then change it to (+ (+ E1 E2) ...) and try again." In other words, EVAL-+. In Scheme you can pretty well count on DO-LOOP1, TRECURSIVE, and TRECURSIVE1 compiling into nearly identical code because they usually are expanded into nearly identical code before the compiler proper ever sees them. TRECURSIVE1 might be slower because fetching the global value of TRECURSIVE1 for the tail-recursive call may involve an actual variable reference, while the local variable references for the other two would certainly be optimized away. It seems like the same should be true of Common Lisp, but in fact many CL compilers treat do loops and top-level procedures specially, leaving locally defined procedures to the mercy of the compiler's most general algorithms. With such a compiler I would expect DO-LOOP1 to be fastest and TRECURSIVE slowest, which is what we observe. This probably explains also why RECURSIVE was slower than RECURSIVE1 in Common Lisp. Using REDUCE for this problem means calling the general version of +, which I would expect to be slow because it involves a rest argument. It might be slow for other reasons also, depending on how REDUCE is coded. Common Lisp was only twice as slow for REDUCE-+ as for RECURSIVE1, which isn't too bad. I wrote REDUCE for Scheme, using a subset of the CL semantics. A curried REDUCE (e.g. ((REDUCE +) LIST1000) instead of (REDUCE + LIST1000)) would have been more in the spirit of Scheme and would not have made any significant difference to the timings. In my opinion a curried REDUCE supplies the most elegant solution to the problem of summing a list: (DEFINE MY-LIST-SUMMATION-PROCEDURE (REDUCE +)). It's also the slowest of the reasonable solutions. The fastest portable solution is tail-recursion for Scheme and some kind of DO loop for Common Lisp. The obvious explanation for the DO-LOOP anomaly is that the Scheme version uses a procedure, FOR-EACH, while the CL version uses a macro, DOLIST, causing the Scheme version to execute 100,000 full-scale procedure calls compared with none for the CL version. I wrote DO-LOOP1 as a similar benchmark that could be translated more easily between the two languages. The starred Common Lisp timing was mine, not Mark's, and thus may not be directly comparable to the others in the CL column. Peace, William Clinger an author of MacScheme Scheme code (assuming a timeit macro): (define (sum-test list . rest) (let ((count (if (null? rest) 100 (car rest)))) (define (repeat thunk) (define (loop n) (if (zero? n) 'done (begin (thunk) (loop (- n 1))))) (loop count)) (define (my-false . x) #f) (define (nothing) (repeat (lambda () (my-false list)))) (define (do-loop) (repeat (lambda () (let ((sum 0)) (for-each (lambda (item) (set! sum (+ sum item))) list) sum)))) (define (do-loop1) (repeat (lambda () (do ((sum 0 (+ sum (car list))) (list list (cdr list))) ((null? list) sum))))) (define (recursive) (define (recursive-call list) (if (null? list) 0 (+ (car list) (recursive-call (cdr list))))) (repeat (lambda () (recursive-call list)))) (define (trecursive) (define (trecursive-call list sum) (if (null? list) sum (trecursive-call (cdr list) (+ (car list) sum)))) (repeat (lambda () (trecursive-call list 0)))) (define (recursive1) (repeat (lambda () (recursive1-call list)))) (define (trecursive1) (repeat (lambda () (trecursive1-call list 0)))) (define (eval-+) (repeat (lambda () (eval (cons '+ list))))) (define (apply-+) (repeat (lambda () (apply + list)))) (define (reduce-+) (repeat (lambda () (reduce + list)))) (timeit (nothing)) (timeit (do-loop)) (timeit (do-loop1)) (timeit (recursive)) (timeit (trecursive)) (timeit (recursive1)) (timeit (trecursive1)) (timeit (reduce-+)) ;(timeit (apply-+)) (timeit (eval-+)))) (define (recursive1-call list) (if (null? list) 0 (+ (car list) (recursive1-call (cdr list))))) (define (trecursive1-call list sum) (if (null? list) sum (trecursive1-call (cdr list) (+ (car list) sum)))) (define (reduce f l) (define (loop x l) (if (null? l) x (loop (f x (car l)) (cdr l)))) (loop (car l) (cdr l))) (define (iota n) (do ((n n (1- n)) (l '() (cons n l))) ((zero? n) l))) (define x (iota 1000)) (sum-test x)