Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uwm.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!ejk
From: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Newsgroups: comp.arch
Subject: yet-another-benchmark
Message-ID: <1989Oct29.134631.28539@ux1.cso.uiuc.edu>
Date: 29 Oct 89 13:46:31 GMT
Sender: paul@ux1.cso.uiuc.edu (Paul Pomes)
Reply-To: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Organization: University of Illinois at Urbana
Lines: 150

Attached is yet-another-benchmark that might cast some light on aspects of
architecture. As with all benchmarks, there is a very serious question of
relevance to one's own applications. However, unlike many others, it is small
enough to see in detail what is being measured. The numbers reported are
trips/processor_second through the loop below. The calculation does not seem
to lend itself to vector/parallel enhancements.

    #include <math.h>                       /* sqrt, fabs */

    int W, H, np, mxp, nP, mxP;             /* bitmap size, point/trip counters and limits */
    double A, B, C;                         /* the -a, -b, -c parameters */
    char *bmap;                             /* 1 bit per pixel, W/8 bytes per row */

    hopalong()
    {
      int wc=W/8, cx=W/2, cy=H/2, ix, iy;
      double x=0, y=0, xx, yy, t;

      while (np < mxp && ++nP < mxP) {
        t = sqrt(fabs(B*x-C));
        xx = y - ( (x<0) ? t : -t );
        yy = A - x;
        x = xx; y = yy;
        ix = cx + x; iy = cy + y;
        if (ix>-1 && iy>-1 && ix<(W-1) && iy<(H-1)) {
          bmap[iy*wc+(ix>>3)] |= 1<<(ix&7); /* record an in-range point */
          np++;
          }
        }
    }

It's building a bitmap of a fractal to display in an X root window. (Barry
Martin algorithm published in A.K. Dewdney's "Computer Recreations" in the
September 86 Scientific American.)

-------------------------------------------------------------------------------
Newsgroups: comp.windows.x
From: ejk@ux1.cso.uiuc.edu (Ed Kubaitis)
Subject: xfroot timing update
Date: Sun, 29 Oct 89 13:14:51 GMT

Here is the 4th updated list of xfroot fractal-points/processor_second
measured on various clients. The number, a count of trips/second through the
9-line "hopalong" loop in xfroot, is a rough index of scalar double-precision
floating point uniprocessor speed.
The lower number represents a case where nearly all points are in-range and
thus require additional integer arithmetic, bit manipulation, and memory
accesses to record the point. The higher number reflects a case when most
points are out of range and most time is spent in floating point arithmetic.
The numbers in parentheses are VAX 780 equivalents. "*" indicates values for
a single processor. New items since the last posting are marked with ">".

  Machine (compiler/options)  lower   (VAX)    upper   (VAX)

  Cray 2 (scc)               304000  (56.2)   619000 (100.3)*
  Cray Y-MP (scc)            316000  (58.4)   476000  (77.1)*
  Cray X-MP (scc)            283000  (52.3)   415000  (67.3)*
  Cray X-MP (cc)             157000  (29.0)   194000  (31.4)*
  Cray 2 (cc)                129000  (23.8)   183000  (29.7)*
  Convex C2 (gcc)            117000  (21.6)   151000  (24.5)*
  Convex C2 (vc3/fastmath)   108000  (20.0)   138000  (22.4)*
  Convex C2 (vc3)             99000  (18.3)   118000  (19.1)*
  DEC DS5800                  95000  (17.6)   115000  (18.6)*
  HP9000/835CHX               66000  (12.2)    92000  (14.9)
  DEC DS5400                  77000  (14.2)    91000  (14.7)
  DEC DS3100                  58000  (10.7)    75000  (12.2)
> Solbourne Series5 Cypress   58000  (10.7)    67000  (10.9)
  Gould NP1                   44000   (8.1)    60000   (9.7)*
  DEC Vax 6400 (vcc)          50000   (9.2)    57000   (9.2)
  Convex C2 (vc2)             49000   (9.1)    55000   (8.9)*
  Convex C2 (cc)              41000   (7.6)    47000   (7.6)*
  Dec Vax 8650                28000   (5.2)    33000   (5.3)
  Sun Sparcstation 1         ~25000   (4.6)      ???   (???)
  HP9000/370 (ffpa)           24000   (4.4)    28000   (4.5)
  Titan                       22800   (4.2)    27100   (4.4)
  DEC MV3900 (vcc)            22900   (4.2)    26100   (4.2)
  DG AViiON (88k 16.7 MHz)    17200   (3.2)    24200   (3.9)
  Sun 4/260                   21100   (3.9)    23600   (3.8)
  Dec Vax 8530                19700   (3.6)    23200   (3.8)
  Sun 4/280                  ~21000   (3.9)   ~23000   (3.7)
  Dec Vax 8600                19700   (3.6)    22400   (3.6)
  DEC Vax 6220                16800   (3.1)    19200   (3.1)
  DEC MV3200 (vcc)            15400   (2.8)    17500   (2.8)
  IBM RT 135 (-f2 -lfm)       15200   (2.8)    17400   (2.8)
  DEC MV3600 (vcc)            14500   (2.7)    17400   (2.8)
  HP9000/370                  15900   (2.9)    17300   (2.8)
  IBM RT125 (afpa)            13900   (2.6)    16000   (2.6)
  HP9000/360                  13700   (2.5)    15200   (2.5)
  DEC Vaxserver 3500          13200   (2.4)    15200   (2.5)
  Dec Vaxstation 3100         13000   (2.4)    15100   (2.4)
> Sun 386i/250 Weitek (cc)    14000   (2.6)    14800   (2.4)
  Sun 3/60 (-O4 lib/f68881)   12900   (2.4)    14000   (2.3)
> Sun 3/50 (gcc 68881)        10500   (1.9)    12700   (2.1)
  IBM RT 135                  10600   (2.0)    11500   (1.9)
  HP9000/350                  10500   (1.9)    11500   (1.9)
  Sequent Symmetry             9900   (1.8)    10500   (1.7)*
  Sun 3/60 (-f 68881)          8000   (1.5)     8750   (1.4)
  386/25 + 387 (cc 386/ix)     7000   (1.3)     8200   (1.3)
  HP9000/330 (HP-UX 6.5 cc)    7280   (1.3)     7910   (1.3)
  IBM RT 125                   7200   (1.3)     7600   (1.2)
  DEC Vaxstation 2000/vcc      5530   (1.0)     6330   (1.0)
  HP9000/330                   5730   (1.1)     6230   (1.0)
  386/25 + 387 (gcc)           6000   (1.1)     6200   (1.0)
  DEC Vax 780                  5410   (1.0)     6170   (1.0)
  HP9000/320                   5580   (1.0)     6150   (1.0)
  Sun 3/50 (-f 68881)          5480   (1.0)     6080   (1.0)
  DEC Vaxstation 2000          4670   (0.9)     5530   (0.9)
  DEC MVII (cc)                4160   (0.8)     5210   (0.8)
  DEC MVII (vcc)               4080   (0.8)     5070   (0.8)
  Sun 3/60                     1960   (0.4)     2060   (0.3)
  Sun 3/50                     1270   (0.2)     1330   (0.2)
  Sun 3/160 (no fpa)            ???   (???)     ~950   (0.2)
> Sun 2/120 (no fpu - cc)       530   (0.1)      560   (0.1)
  DEC Vax 730                   340   (0.1)      360   (0.1)
  386/25 (386/ix - no 387)      259   (0.0)      260   (0.0)

A few notes on the results: the Cray scc compiler uses the same backend as
their Fortran compiler. gcc enhancements are due to inline code for sqrt and
fabs. The three top Convex C2 measurements use compilers/libraries that
exploit the C2 hardware sqrt.
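
The VAX 780 equivalents in parentheses appear to be each rate divided by the
DEC Vax 780 rate for the same run (5410 for the lower bound, 6170 for the
upper bound), rounded to one decimal place. The post does not state the
formula, so treat that as an assumption; it does reproduce the table entries
checked, e.g. 304000/5410 gives 56.2. A minimal sketch, in which the helper
name vax_equivalent and the two macros are hypothetical:

```c
#include <assert.h>

/* Baseline rates copied from the "DEC Vax 780" row of the table. */
#define VAX780_LOWER 5410.0   /* lower-bound run, trips/processor_second */
#define VAX780_UPPER 6170.0   /* upper-bound run, trips/processor_second */

/*
 * Hypothetical helper (not part of xfroot): express a measured rate as a
 * multiple of a baseline rate, rounded to one decimal place. Assumes the
 * parenthesized table figures are plain ratios against the VAX 780 row.
 */
double vax_equivalent(double rate, double baseline)
{
    assert(rate >= 0.0 && baseline > 0.0);
    /* Scale by 10, round half up by truncating after adding 0.5, scale back. */
    return (double)(long)(rate / baseline * 10.0 + 0.5) / 10.0;
}
```

A new measurement would be normalized with, for example,
vax_equivalent(my_lower_rate, VAX780_LOWER), where my_lower_rate is whatever
the lower-bound run reported.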
It pays to shop around for the best compiler/options/libraries available for
your floating point intensive code.

Thanks to:

  bav@hobbes.ksu.ksu.edu, bryan%kewill@uunet.uu.net, bt@irfu.se,
  csmith@convex.com, csu@alembic.acs.com, eric@geology.tn.cornell,
  dave@rutgers.edu, evans@decvax.dec.com, glenn@mathcs.emory.edu,
  harrison@decwrl.dec.com, hleroy@erisa.fr, howard@aic.hrl.hac.com,
  hrp@boring.cray.com, idallen@watgcl.waterloo.edu, jpb@sn2024.cray.com,
  jw@pan.uucp, kline@ux1.cso.uiuc.edu, markw@airgun.wg.waii.com,
  paul@db0tui66.bitnet, rauletta@gmuvax2.gmu.edu, skam@solbourne.com,
  steved@longs.lance.colostate.edu, tac@csl.ncsu.edu, tpf@jdyx.uucp,

for sharing their results.

I would appreciate hearing about measurements on other clients, or results
differing significantly from those above. To perform your own:

1. Get xfroot/part01 (V5-I3) and xfroot/patch1 (V5-I7) from comp.sources.x.
   These are available via anonymous ftp from uunet.uu.net. While they will
   eventually be found there in comp.sources.x/volume5, as of this writing
   they are in comp.sources.x/new/890924.0.Z and 890929.0. If you don't have
   ftp access to uunet.uu.net, I will be happy to mail a copy (~700 lines.)

2. Install xfroot on the client to be tested, taking care that you have
   verified the definition of HZ in xfroot.c. (See the README.)

3. Make the following two runs:

     xfroot -a 0.1 -b 0.1 -c 0.1           (lower bound)
     xfroot -a 3000 -b 3000 -c 3000        (upper bound)

-------------------------
Ed Kubaitis (ejk@ux1.cso.uiuc.edu)
Computing Services Office - University of Illinois, Urbana