Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!ejk
From: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Newsgroups: comp.arch
Subject: yet-another-benchmark
Message-ID: <1989Oct29.134631.28539@ux1.cso.uiuc.edu>
Date: 29 Oct 89 13:46:31 GMT
Sender: paul@ux1.cso.uiuc.edu (Paul Pomes)
Reply-To: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Organization: University of Illinois at Urbana
Lines: 150

Attached is yet-another-benchmark that might cast some light on aspects
of architecture.  As with all benchmarks, there is a very serious question 
of relevance to one's own applications. However, unlike many others, it 
is small enough to see in detail what is being measured.

The numbers reported are trips/processor_second through the loop below. The
calculation does not seem to lend itself to vector/parallel enhancements.

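   #include <math.h>              /* sqrt, fabs */

   int W, H, np, mxp, nP, mxP;    /* bitmap size; point and trip counts, limits */
   double A, B, C;                /* hopalong parameters */
   char *bmap;                    /* W x H bitmap, one bit per pixel */

   hopalong() {
      int wc=W/8, cx=W/2, cy=H/2, ix, iy;   /* bytes per row; bitmap center */
      double x=0, y=0, xx, yy, t;

      while (np < mxp && ++nP < mxP) {
         t = sqrt(fabs(B*x-C));
         xx = y - ( (x<0) ? t : -t );
         yy = A - x;
         x = xx; y = yy;
         ix = cx + x; iy = cy + y;
         if (ix>-1 && iy>-1 && ix<(W-1) && iy<(H-1)) {   /* in range */
            bmap[iy*wc+(ix>>3)] |= 1<<(ix&7);            /* set the pixel bit */
            np++;
         }
      }
   }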

The loop builds a bitmap of a fractal for display in an X root window,
using the Barry Martin algorithm published in A.K. Dewdney's "Computer
Recreations" column in the September 1986 Scientific American.

-------------------------------------------------------------------------------
Newsgroups: comp.windows.x
From: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Subject: xfroot timing update
Date: Sun, 29 Oct 89 13:14:51 GMT

Here is the 4th updated list of xfroot fractal-points/processor_second
measured on various clients. Each entry gives two numbers, both counts of
trips/second through the 9-line "hopalong" loop in xfroot, which serve as a
rough index of scalar double-precision floating point uniprocessor speed.
The lower number (left column) represents a case where nearly all points are
in-range and thus require additional integer arithmetic, bit manipulation,
and memory accesses to record the point. The higher number (right column)
reflects a case where most points are out of range and most time is spent in
floating point arithmetic.  The numbers in parentheses are VAX 780
equivalents. "*" indicates values for a single processor.  New items since
the last posting are marked with ">".

       Client (compiler/options)         lower (VAX)      upper (VAX)
       Cray 2    (scc)                 304000 (56.2)    619000(100.3)*
       Cray Y-MP (scc)                 316000 (58.4)    476000 (77.1)*
       Cray X-MP (scc)                 283000 (52.3)    415000 (67.3)*
       Cray X-MP (cc)                  157000 (29.0)    194000 (31.4)*
       Cray 2    (cc)                  129000 (23.8)    183000 (29.7)*
       Convex C2 (gcc)                 117000 (21.6)    151000 (24.5)*
       Convex C2 (vc3/fastmath)        108000 (20.0)    138000 (22.4)*
       Convex C2 (vc3)                  99000 (18.3)    118000 (19.1)*
       DEC DS5800                       95000 (17.6)    115000 (18.6)*
       HP9000/835CHX                    66000 (12.2)     92000 (14.9) 
       DEC DS5400                       77000 (14.2)     91000 (14.7) 
       DEC DS3100                       58000 (10.7)     75000 (12.2) 
    >  Solbourne Series5 Cypress        58000 (10.7)     67000 (10.9) 
       Gould NP1                        44000  (8.1)     60000  (9.7)*
       DEC Vax 6400 (vcc)               50000  (9.2)     57000  (9.2) 
       Convex C2 (vc2)                  49000  (9.1)     55000  (8.9)*
       Convex C2 (cc)                   41000  (7.6)     47000  (7.6)*
       DEC Vax 8650                     28000  (5.2)     33000  (5.3) 
       Sun Sparcstation 1              ~25000  (4.6)       ???  (???) 
       HP9000/370 (ffpa)                24000  (4.4)     28000  (4.5) 
       Titan                            22800  (4.2)     27100  (4.4) 
       DEC MV3900 (vcc)                 22900  (4.2)     26100  (4.2) 
       DG AViiON (88k 16.7 MHz)         17200  (3.2)     24200  (3.9) 
       Sun 4/260                        21100  (3.9)     23600  (3.8) 
       DEC Vax 8530                     19700  (3.6)     23200  (3.8) 
       Sun 4/280                       ~21000  (3.9)    ~23000  (3.7) 
       DEC Vax 8600                     19700  (3.6)     22400  (3.6) 
       DEC Vax 6220                     16800  (3.1)     19200  (3.1) 
       DEC MV3200 (vcc)                 15400  (2.8)     17500  (2.8) 
       IBM RT 135 (-f2 -lfm)            15200  (2.8)     17400  (2.8) 
       DEC MV3600 (vcc)                 14500  (2.7)     17400  (2.8) 
       HP9000/370                       15900  (2.9)     17300  (2.8) 
       IBM RT125 (afpa)                 13900  (2.6)     16000  (2.6) 
       HP9000/360                       13700  (2.5)     15200  (2.5) 
       DEC Vaxserver 3500               13200  (2.4)     15200  (2.5) 
       DEC Vaxstation 3100              13000  (2.4)     15100  (2.4) 
    >  Sun 386i/250 Weitek (cc)         14000  (2.6)     14800  (2.4) 
       Sun 3/60 (-O4 lib/f68881)        12900  (2.4)     14000  (2.3) 
    >  Sun 3/50 (gcc 68881)             10500  (1.9)     12700  (2.1) 
       IBM RT 135                       10600  (2.0)     11500  (1.9) 
       HP9000/350                       10500  (1.9)     11500  (1.9) 
       Sequent Symmetry                  9900  (1.8)     10500  (1.7)*
       Sun 3/60 (-f 68881)               8000  (1.5)      8750  (1.4) 
       386/25 + 387   (cc 386/ix)        7000  (1.3)      8200  (1.3) 
       HP9000/330 (HP-UX 6.5 cc)         7280  (1.3)      7910  (1.3) 
       IBM RT 125                        7200  (1.3)      7600  (1.2) 
       DEC Vaxstation 2000/vcc           5530  (1.0)      6330  (1.0) 
       HP9000/330                        5730  (1.1)      6230  (1.0) 
       386/25 + 387   (gcc)              6000  (1.1)      6200  (1.0) 
       DEC Vax 780                       5410  (1.0)      6170  (1.0) 
       HP9000/320                        5580  (1.0)      6150  (1.0) 
       Sun 3/50 (-f 68881)               5480  (1.0)      6080  (1.0) 
       DEC Vaxstation 2000               4670  (0.9)      5530  (0.9) 
       DEC MVII   (cc)                   4160  (0.8)      5210  (0.8) 
       DEC MVII   (vcc)                  4080  (0.8)      5070  (0.8) 
       Sun 3/60                          1960  (0.4)      2060  (0.3) 
       Sun 3/50                          1270  (0.2)      1330  (0.2) 
       Sun 3/160 (no fpa)                 ???  (???)      ~950  (0.2) 
    >  Sun 2/120 (no fpu - cc)            530  (0.1)       560  (0.1) 
       DEC Vax 730                        340  (0.1)       360  (0.1) 
       386/25 (386/ix - no 387)           259  (0.0)       260  (0.0) 

A few notes on the results: Cray's scc compiler uses the same backend
as Cray's Fortran compiler.  The gcc improvements are due to inline code
for sqrt and fabs. The top three Convex C2 measurements use
compilers/libraries that exploit the C2 hardware sqrt. It pays to shop
around for the best compiler/options/libraries available for your
floating point intensive code.
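
The parenthesized VAX 780 equivalents in the list are consistent with each
rate divided by the DEC Vax 780 entry in the same column (5410 for the lower
bound, 6170 for the upper) and rounded to one decimal. A small sketch of
that arithmetic follows; the function and macro names are assumptions made
for illustration.

```c
#include <assert.h>

/* Reference rates: the DEC Vax 780 row of the list. */
#define VAX780_LOWER 5410.0
#define VAX780_UPPER 6170.0

/*
 * Returns rate / vax780_rate rounded to one decimal place.
 * Both arguments are trips per processor second and must be positive.
 */
double vax_equivalent(double rate, double vax780_rate)
{
    double scaled;

    assert(rate >= 0.0 && vax780_rate > 0.0);
    scaled = rate / vax780_rate * 10.0 + 0.5;   /* round half up */
    return (double)(long)scaled / 10.0;
}
```

For example, vax_equivalent(304000, VAX780_LOWER) reproduces the 56.2
listed for the Cray 2 (scc) lower bound.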

Thanks to:  bav@hobbes.ksu.ksu.edu, bryan%kewill@uunet.uu.net, bt@irfu.se, 
csmith@convex.com, csu@alembic.acs.com, eric@geology.tn.cornell, 
dave@rutgers.edu, evans@decvax.dec.com, glenn@mathcs.emory.edu, 
harrison@decwrl.dec.com, hleroy@erisa.fr, howard@aic.hrl.hac.com, 
hrp@boring.cray.com, idallen@watgcl.waterloo.edu, jpb@sn2024.cray.com, 
jw@pan.uucp, kline@ux1.cso.uiuc.edu, markw@airgun.wg.waii.com, 
paul@db0tui66.bitnet, rauletta@gmuvax2.gmu.edu, skam@solbourne.com,
steved@longs.lance.colostate.edu, tac@csl.ncsu.edu, and tpf@jdyx.uucp
for sharing their results.

I would appreciate hearing about measurements on other clients, or results
differing significantly from those above.  To perform your own:

	1. Get xfroot/part01 (V5-I3) and xfroot/patch1 (V5-I7) from
	   comp.sources.x. These are available via anonymous ftp from
	   uunet.uu.net. They will eventually be found there in
	   comp.sources.x/volume5, but as of this writing they are in
	   comp.sources.x/new/890924.0.Z and 890929.0. If you don't
	   have ftp access to uunet.uu.net, I will be happy to mail
	   a copy (~700 lines).
	2. Install xfroot on the client to be tested, taking care
	   to verify the definition of HZ in xfroot.c.
	   (See the README.)
	3. Make the following two runs:

	      xfroot -a 0.1 -b 0.1 -c 0.1    (lower bound)
	      xfroot -a 3000 -b 3000 -c 3000 (upper bound)
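
The HZ check in step 2 matters because a tick count is only turned into
processor seconds by dividing by the clock tick rate, so a wrong HZ scales
every reported rate. The sketch below shows that conversion on a POSIX
system. It assumes tick-based timing via times(), which this post does not
confirm is what xfroot.c does, and it asks sysconf() for the tick rate
rather than using a compiled-in HZ. The function names are hypothetical.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* User-mode CPU seconds used so far by this process, or -1.0 on error. */
double cpu_seconds(void)
{
    struct tms t;
    long hz = sysconf(_SC_CLK_TCK);   /* ticks per second, in place of HZ */

    if (hz <= 0 || times(&t) == (clock_t)-1)
        return -1.0;
    return (double)t.tms_utime / (double)hz;
}

/* Trips per processor second over an interval of CPU time. */
double trips_per_second(long trips, double start_secs, double end_secs)
{
    assert(end_secs > start_secs);
    return (double)trips / (end_secs - start_secs);
}
```

Sampling cpu_seconds() before and after the loop and passing both values
with the trip count to trips_per_second() yields a figure comparable to the
ones in the list, provided the run is long enough to span many ticks.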


-------------------------
Ed Kubaitis (ejk@ux1.cso.uiuc.edu)
Computing Services Office - University of Illinois, Urbana