Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!cs.utexas.edu!sun-barr!newstop!sun!amdcad!light!bvs
From: bvs@light.uucp (Bakul Shah)
Newsgroups: comp.lang.postscript
Subject: Re: GCD algorithm
Message-ID: <1990Nov4.001609.5702@light.uucp>
Date: 4 Nov 90 00:16:08 GMT
References: <1990Oct31.182705.3184@light.uucp> <90305.105824CXT105@psuvm.psu.edu>
Reply-To: bvs@light.UUCP (Bakul Shah)
Organization: Bit Blocks, Inc.
Lines: 86

In an earlier article I gave two versions of a GCD algorithm and
claimed the recursive vesion was faster.  In response Christopher
Tate writes:

> I'm very suprised that the recursive version is faster than the iterative
> one; I suspect it has to do with the way recursion is implemented in
> PostScript.  Is it still faster if you don't bind the gcd procedures?  How
> about if you give it two large, almost-equal primes?  In most programming
> languages (i.e. native code compilation, etc.), given an iterative algorithm
> and an equivalent recursive algorithm, the iterative algorithm will be faster
> if only because it won't have to go through all the overhead of the recursive
> function calls....

In this case it seems the overhead to set up a loop and take it
down when done is higher than just calling a procedure.  Anyway,
after some more timing tests I find that which version is faster
depends on input numbers, on binding and on the interpreter.  See
test results below.  One surprise was that ghostscript 2.0 on a
sun 3/50 was about twice as fast as the QMS PS410 even though the
3/50 cpu is slower and spends some fraction of time in refreshing
the screen.

-- Bakul Shah
   bvs@BitBlocks.COM
   ..!{ames,att,decwrl,pyramid,sun,uunet}!amdcad!light!bvs

The test procedures:

  % {recursive, iterative}{bind def, def} versions of Euclid's gcd alg.
  /gcd1 { dup 3 1 roll mod dup 0 eq { pop } { gcd1 } ifelse } def
  /gcd2 { dup 3 1 roll mod dup 0 eq { pop } { gcd2 } ifelse } bind def
  /gcd3 { { dup 3 1 roll mod dup 0 eq { pop exit } if } loop } def
  /gcd4 { { dup 3 1 roll mod dup 0 eq { pop exit } if } loop } bind def

  % time to exec a procedure
  /time { /t0 usertime def exec usertime t0 sub } bind def

  /test {
    /ncycles exch def
    /num1 exch def
    /num0 exch def

    % compute overhead time for running the test
    /overhead { ncycles { num0 num1 pop pop } repeat } time def

    % collect test times in an array
    [
      [ /gcd1 load /gcd2 load /gcd3 load /gcd4 load ]
      % run each procedure ncyles times and compute run time.
      { /gcd exch def { ncycles { num0 num1 gcd pop } repeat } time } forall
    ]

    % subtract overhead from test times and print
    { overhead sub == ( ) print } forall
  } bind def

Some sample tests:
   31415  31415 1000 test
  111111  11111 1000 test
   11111 111111 1000 test
  123456  23456 1000 test

Results:  (time in millseconds, gs, ralpage run on a sun 3/50)

	gcd1    gcd2    gcd3    gcd4

ghostscript:
	 340     440     480     600     31415  31415 1000 test
	 660     860     820    1020    111111  11111 1000 test
	1020    1280    1200    1480     11111 111111 1000 test
	2300    2940    2600    3240    123456  23456 1000 test

QMS PS410:
	 736     860     900    1064     31415  31415 1000 test
	1472    1708    1564    1840    111111  11111 1000 test
	2204    2548    2224    2604     11111 111111 1000 test
	5132    5908    4872    5684    123456  23456 1000 test

RALpage:
	2450    3484    3000    4351     31415  31415 1000 test
	4934    6867    5267    7518    111111  11111 1000 test
	7433   10267    7516   10666     11111 111111 1000 test
       17634   23701   16534   23351    123456  23456 1000 test

BTW, the C version is about 50 times faster than the fastest
PostScript version under ghostscript.