Path: utzoo!attcan!uunet!samsung!usc!elroy.jpl.nasa.gov!ncar!gatech!udel!nigel.ee.udel.edu!mccalpin
From: mccalpin@perelandra.cms.udel.edu (John D. McCalpin)
Newsgroups: comp.unix.aix
Subject: (LONG!) 64-bit Livermore Fortran Kernels Benchmark Results
Message-ID:
Date: 11 Nov 90 12:51:36 GMT
Sender: usenet@ee.udel.edu
Organization: College of Marine Studies, U. Del.
Lines: 1055
Nntp-Posting-Host: perelandra.cms.udel.edu

This message contains the complete output for the Livermore Fortran
Kernels benchmark test running on an IBM RS/6000 Model 320 in DOUBLE
precision (64-bit).  The preceding note contains the same results for
single-precision arithmetic (32-bit).

Comments:
---------
(1) The subroutine called "SIGNAL" supplied with the test must be
    renamed since its name conflicts with an IBM system library
    (causing the code to dump core).

(2) The tests were run with the "MULTI" parameter set to 50 (instead
    of the default of 10) in order to get a long enough run to time
    accurately.

(3) The function SECOND() was defined as:

          REAL FUNCTION SECOND(OLDSEC)
          SECOND=MCLOCK()*0.01-OLDSEC
          RETURN
          END

    Note that the value of 0.01 (100 ticks per second) is correct.
    The value of 60 ticks per second given in the IBM documentation
    is incorrect.

(4) The column labeled "OK" in the output gives the number of
    significant figures of accuracy of the checksum for each test.
    IGNORE THIS COLUMN!!!!  It is based on results for MULTI=10 and
    so is not correct for the case I ran (MULTI=50).

(5) I did run the single and double-precision cases with MULTI=10 to
    check the checksums and got results in agreement with another
    IEEE machine (a Silicon Graphics 4D series box).  For 32-bit
    arithmetic the checksums had typically 7-8 decimal digits of
    accuracy.  Note that there are some obscure bugs (?) in the code
    that prevent the calculation of the checksum from being 64-bit
    accurate on a 32-bit machine when everything is declared
    double-precision.
    I assume that this is due to some implicit typecasts that I have
    not been able to find.  In any case, I have verified that the code
    is correct by running it with the "-r8" flag on the Silicon
    Graphics machine, which sets default REAL precision to 64 bits in
    a fully consistent way.  This gave accuracies of about 16 decimal
    digits.  Since IBM does not currently provide an "auto-double"
    option on the xlf compiler, I was unable to reproduce these
    results on the RS/6000.

(6) The code was compiled with the following command:

          xlf -O loops.f

    Some minor performance improvements may be obtainable through the
    use of other compiler options --- I have not tested these.

(7) PLEASE NOTE that all of these tests are effectively
    cache-containable.  Unless *your* applications are also
    cache-containable (or at least cache-friendly), you will not see
    the >20 MFLOPS performance levels shown here.  On the other hand,
    certain carefully-coded subroutines (such as DGEMM in IBM's
    libblas.a) can run at over 30 MFLOPS on the Model 320 even for
    arrays much larger than cache.

(8) Finally, here are the 64-bit results:
----------------------------------------------------------------------

 verify adequate loop size versus cpu clock accuracy
 -----   ---------   -------   -------   --------
 extra    maximum    digital   dynamic   relative
 loop     cputime    clock     clock     timing
 size     seconds    error     error     error
 -----   ---------   -------   -------   --------
     1   .0000E+00   100.00%   100.00%    100.00%
     2   .0000E+00   100.00%   100.00%    100.00%
     4   .1000E-01      .00%   264.57%   1888.35%
     8   .0000E+00   100.00%   100.00%    100.00%
    16   .1000E-01      .00%   264.57%    397.09%
    32   .0000E+00   100.00%   100.00%    100.00%
    64   .1000E-01      .00%   264.57%     24.27%
   128   .1000E-01      .00%   173.21%     24.27%
   256   .1000E-01      .00%   100.00%     24.27%
   512   .1000E-01      .00%    57.74%      6.80%
  1024   .2000E-01      .00%    29.79%       .97%
  2048   .4000E-01      .00%    13.32%       .97%
  4096   .7000E-01      .00%     7.69%       .97%
  6800   current run:  multi=  50.000
 -----   ---------   -------   -------   --------
 approximate serial job time=   .4E+02 sec.
 ( nruns=  7 runs)

 trial=  1   chksum=  421   pass=  0   fail=  0
 trial=  2   chksum=  421   pass=  1   fail=  0
 trial=  3   chksum=  421   pass=  2   fail=  0
 trial=  4   chksum=  421   pass=  3   fail=  0
 trial=  5   chksum=  421   pass=  4   fail=  0
 trial=  6   chksum=  421   pass=  5   fail=  0
 trial=  7   chksum=  421   pass=  6   fail=  0

1 cpu clock overhead (t err):
     run     average      standev      minimum      maximum
 tick  1  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 tick  2  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 tick  3  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 tick  4  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 tick  5  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 tick  6  .100000E-01  .000000E+00  .100000E-01  .100000E-01
 tick  7  .000000E+00  .000000E+00  .000000E+00  .000000E+00
 data  7  .999866E-01  .543323E-06  .999856E-01  .999877E-01
 data  7  .999860E-01  .917423E-06  .999843E-01  .999877E-01
 tick  7  .142857E-02  .349927E-02  .000000E+00  .100000E-01

 the experimental timing errors for all  7 runs
 --  ---------  ---------  ---------  ------  -----  ---
  k    t min      t avg      t max     t err   tick  p-f
 --  ---------  ---------  ---------  ------  -----  ---
  1  .1000E+00  .1057E+00  .1100E+00   4.68%  1.30%    0
  2  .1100E+00  .1157E+00  .1200E+00   4.28%  1.19%    0
  3  .4000E-01  .4286E-01  .5000E-01  10.54%  2.86%    0
  4  .4000E-01  .4714E-01  .5000E-01   9.58%  2.86%    0
  5  .9000E-01  .1000E+00  .1100E+00   5.35%  1.43%    0
  6  .4000E-01  .5000E-01  .6000E-01  10.69%  2.86%    0
  7  .1300E+00  .1400E+00  .1500E+00   3.82%  1.02%    0
  8  .1300E+00  .1471E+00  .1500E+00   4.76%  1.02%    0
  9  .1700E+00  .1771E+00  .2000E+00   5.82%   .79%    0
 10  .3600E+00  .3786E+00  .3900E+00   2.20%   .39%    0
 11  .7000E-01  .8286E-01  .9000E-01   8.45%  1.79%    0
 12  .8000E-01  .9000E-01  .1000E+00   5.94%  1.59%    0
 13  .7100E+00  .7229E+00  .7300E+00    .97%   .20%    0
 14  .5800E+00  .5900E+00  .6000E+00    .91%   .24%    0
 15  .2700E+00  .2786E+00  .2800E+00   1.26%   .51%    0
 16  .1700E+00  .1800E+00  .1900E+00   4.20%   .79%    0
 17  .2300E+00  .2386E+00  .2400E+00   1.47%   .60%    0
 18  .2100E+00  .2186E+00  .2200E+00   1.60%   .65%    0
 19  .1100E+00  .1200E+00  .1300E+00   4.45%  1.19%    0
 20  .2700E+00  .2786E+00  .2800E+00   1.26%   .51%    0
 21  .1680E+01  .1689E+01  .1710E+01    .59%   .08%    0
 22  .2600E+00  .2714E+00  .2800E+00   2.35%   .51%    0
 23  .1700E+00  .1771E+00  .1800E+00   2.55%   .79%    0
 24  .1600E+00  .1657E+00  .1700E+00   2.99%   .84%    0
 --  ---------  ---------  ---------  ------  -----  ---

 net cpu timing variance (t err);  a few % is ok:
         average   standev   minimum   maximum
 terr      4.19%     2.99%      .59%    10.69%

1 ********************************************
  the livermore  fortran kernels:  m f l o p s
  ********************************************

  computer : IBM RS/6000 Model 320
  system   : AIX 3.1, 64-bit, 20 MHz
  compiler : xlf -O (v1.1) 32kB cache
  date     : 11/02/90

  mean do span = 471

  when the computer performance range is very large
  the net mflops rate of many fortran programs and
  workloads will be in the sub-range between the
  equi-weighted harmonic and arithmetic means depending
  on the degree of code parallelism and optimization.
  the least biased central measure is the geometric
  mean of 72 rates, quoted +- a standard deviation.

 kernel    flops    microsec  mflop/sec  span  weight  check-sums              ok
 ------  ---------  ---------  --------  ----  ------  ----------------------  --
      1  .1752E+07  .1057E+06   16.5706  1001    1.00  .3580257008721524E+06    7
      2  .1300E+07  .1157E+06   11.2328   101    1.00  .3605241763137281E+04    8
      3  .9009E+06  .4286E+05   21.0210  1001    1.00  .7005200179428446E+02    8
      4  .8400E+06  .4714E+05   17.8182  1001    1.00  .4199475392699242E+01    8
      5  .1000E+07  .1000E+06   10.0000  1001    1.00  .3184210149404855E+05    8
      6  .5952E+06  .5000E+05   11.9040    64    1.00  .2288741318728983E+26    0
      7  .3184E+07  .1400E+06   22.7429   995    1.00  .4272975754667242E+06    8
      8  .3564E+07  .1471E+06   24.2214   100    1.00  .1050887604075329E+07    8
      9  .3091E+07  .1771E+06   17.4469   101    1.00  .8326105274034671E+06    8
     10  .1545E+07  .3786E+06    4.0819   101    1.00  .5117258847672790E+06    8
     11  .5500E+06  .8286E+05    6.6379  1001    1.00  .2340037680671110E+09    8
     12  .6000E+06  .9000E+05    6.6667  1000    1.00  .2126321196556091E-03    1
     13  .8064E+06  .7229E+06    1.1156    64    1.00  .1552259922630282E+12    0
     14  .1101E+07  .5900E+06    1.8663  1001    1.00  .2087503090558357E+11    4
     15  .8250E+06  .2786E+06    2.9615   101    1.00  .2760671680302306E+06    8
     16  .6625E+06  .1800E+06    3.6806    75    1.00  .9892820000000000E+06    0
     17  .1591E+07  .2386E+06    6.6678   101    1.00  .7802492407439027E+04    8
     18  .2178E+07  .2186E+06    9.9647   100    1.00  .4342684487413166E+06    1
     19  .1182E+07  .1200E+06    9.8475   101    1.00  .3795271869848748E+04    8
     20  .1300E+07  .2786E+06    4.6667  1000    1.00  .2128450971917007E+09    8
     21  .6312E+07  .1689E+07    3.7384   101    1.00  .2812029997465708E+09    0
     22  .9444E+06  .2714E+06    3.4792   101    1.00  .2057023067602647E+04    8
     23  .2178E+07  .1771E+06   12.2952   100    1.00  .2484930351216168E+06    5
     24  .2500E+06  .1657E+06    1.5086  1001    1.00  .3500000000000000E+04    8
 ------  ---------  ---------  --------  ----  ------  ----------------------  --
     24  .3825E+08  .6407E+07    5.9702   471                                  122

 mflops range:           report all range statistics:
 maximum   rate =   24.2214 mega-flops/sec.
 quartile  q3   =   14.4329 mega-flops/sec.
 average   rate =    9.6723 mega-flops/sec.
 geometric mean =    7.0671 mega-flops/sec.
 median    q2   =    8.2577 mega-flops/sec.
 harmonic  mean =    4.7693 mega-flops/sec.
 quartile  q1   =    3.7095 mega-flops/sec.
 minimum   rate =    1.1156 mega-flops/sec.

 standard  dev. =    6.8893 mega-flops/sec.
 geom.mean dev. =    7.3654 mega-flops/sec.
 mean precision =    5.08 decimal digits

1 sensitivity analysis

 the sensitivity of the harmonic mean rate (mflops)
 to various weightings is shown in the table below.
 seven work distributions are generated by assigning
 two distinct weights to ranked kernels by quartiles.
forty nine possible cpu workloads are then evaluated using seven sets of values for the total weights: ------ ------ ------ ------ ------ ------ ------ 1st qt: o o o o o x x 2nd qt: o o o x x x o 3rd qt: o x x x o o o 4th qt: x x o o o o o ------ ------ ------ ------ ------ ------ ------ total weights net mflops: x o ---- ---- 1.00 .00 2.01 2.88 5.10 6.92 10.79 13.91 19.57 .95 .05 2.09 3.00 5.07 6.62 9.95 11.67 16.21 .90 .10 2.17 3.13 5.05 6.35 9.23 10.05 13.84 .80 .20 2.37 3.42 5.00 5.86 8.07 7.87 10.71 .70 .30 2.61 3.78 4.96 5.45 7.17 6.47 8.73 .60 .40 2.90 4.22 4.92 5.09 6.45 5.49 7.37 .50 .50 3.27 4.77 4.87 4.77 5.86 4.77 6.38 ---- ---- ------ ------ ------ ------ ------ ------ ------ sensitivity of net mflops rate to use of optimal fortran code(sisd/simd model) 2.88 3.47 4.37 5.89 7.14 9.06 12.39 15.17 19.57 .00 .20 .40 .60 .70 .80 .90 .95 1.00 fraction of operations run at optimal fortran rates 1 cpu clock overhead (t err): run average standev minimum maximum tick 1 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 2 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 3 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 4 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 5 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 6 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 7 .000000E+00 .000000E+00 .000000E+00 .000000E+00 data 7 .999866E-01 .543323E-06 .999856E-01 .999877E-01 data 7 .999860E-01 .917423E-06 .999843E-01 .999877E-01 tick 7 .000000E+00 .000000E+00 .000000E+00 .000000E+00 the experimental timing errors for all 7 runs -- --------- --------- --------- ----- ----- --- k t min t avg t max t err tick p-f -- --------- --------- --------- ----- ----- --- 1 .1200E+00 .1243E+00 .1300E+00 3.98% .00% 0 2 .1400E+00 .1414E+00 .1500E+00 2.47% .00% 0 3 .6000E-01 .6000E-01 .6000E-01 .00% .00% 0 4 .8000E-01 .8143E-01 .9000E-01 4.30% .00% 0 5 .1100E+00 .1143E+00 .1200E+00 4.33% .00% 0 6 .6000E-01 .6571E-01 .7000E-01 7.53% .00% 0 7 .1600E+00 .1600E+00 
.1600E+00 .00% .00% 0 8 .1700E+00 .1786E+00 .1800E+00 1.96% .00% 0 9 .2000E+00 .2071E+00 .2200E+00 3.38% .00% 0 10 .4200E+00 .4271E+00 .4300E+00 1.06% .00% 0 11 .1000E+00 .1000E+00 .1000E+00 .00% .00% 0 12 .1000E+00 .1057E+00 .1100E+00 4.68% .00% 0 13 .8200E+00 .8257E+00 .8300E+00 .60% .00% 0 14 .5100E+00 .5157E+00 .5200E+00 .96% .00% 0 15 .5500E+00 .5571E+00 .5600E+00 .81% .00% 0 16 .2100E+00 .2157E+00 .2200E+00 2.29% .00% 0 17 .2700E+00 .2714E+00 .2800E+00 1.29% .00% 0 18 .2200E+00 .2200E+00 .2200E+00 .00% .00% 0 19 .1400E+00 .1414E+00 .1500E+00 2.47% .00% 0 20 .4300E+00 .4386E+00 .4400E+00 .80% .00% 0 21 .5000E+00 .5086E+00 .5100E+00 .69% .00% 0 22 .3400E+00 .3471E+00 .3500E+00 1.30% .00% 0 23 .2200E+00 .2243E+00 .2300E+00 2.21% .00% 0 24 .2000E+00 .2057E+00 .2100E+00 2.41% .00% 0 -- --------- --------- --------- ----- ----- --- net cpu timing variance (t err); a few % is ok: average standev minimum maximum terr 2.06% 1.83% .00% 7.53% 1 ******************************************** the livermore fortran kernels: m f l o p s ******************************************** computer : IBM RS/6000 Model 320 system : AIX 3.1, 64-bit, 20 MHz compiler : xlf -O (v1.1) 32kB cache date : 11/02/90 mean do span = 90 when the computer performance range is very large the net mflops rate of many fortran programs and workloads will be in the sub-range between the equi- weighted harmonic and arithmetic means depending on the degree of code parallelism and optimization. the least biased central measure is the geometric mean of 72 rates, quoted +- a standard deviation. 
kernel flops microsec mflop/sec span weight check-sums ok ------ ----- -------- --------- ---- ------ ---------------------- -- 1 .2020E+07 .1243E+06 16.2529 101 2.00 .3677341471833584E+04 7 2 .1552E+07 .1414E+06 10.9737 101 2.00 .3605241763137281E+04 8 3 .1071E+07 .6000E+05 17.8433 101 2.00 .7068190058985076E+01 8 4 .8400E+06 .8143E+05 10.3158 101 2.00 .4199475392699242E+01 8 5 .1100E+07 .1143E+06 9.6250 101 2.00 .3212322355798968E+03 8 6 .6720E+06 .6571E+05 10.2261 32 2.00 .1960856598681209E+30 0 7 .3555E+07 .1600E+06 22.2200 101 2.00 .4441910425224146E+04 8 8 .4277E+07 .1786E+06 23.9501 100 2.00 .1050887604075329E+07 8 9 .3606E+07 .2071E+06 17.4068 101 2.00 .8326105274034671E+06 8 10 .1727E+07 .4271E+06 4.0434 101 2.00 .5117258847672790E+06 8 11 .6400E+06 .1000E+06 6.4000 101 2.00 .2403492284702882E+06 8 12 .6800E+06 .1057E+06 6.4324 100 2.00 .4923343658447266E-04 2 13 .9184E+06 .8257E+06 1.1122 32 2.00 .1001353007004402E+12 0 14 .1111E+07 .5157E+06 2.1543 101 2.00 .2139450190318256E+09 2 15 .1650E+07 .5571E+06 2.9615 101 2.00 .2760671680302306E+06 8 16 .7560E+06 .2157E+06 3.5046 40 2.00 .1134287000000000E+07 0 17 .1818E+07 .2714E+06 6.6979 101 2.00 .7802492407439027E+04 8 18 .2178E+07 .2200E+06 9.9000 100 2.00 .4342684487413166E+06 1 19 .1394E+07 .1414E+06 9.8552 101 2.00 .3795271869848748E+04 8 20 .2080E+07 .4386E+06 4.7427 100 2.00 .2188343542345499E+06 7 21 .6250E+07 .5086E+06 12.2893 50 2.00 .1373396176759844E+09 0 22 .1202E+07 .3471E+06 3.4623 101 2.00 .2057023067602647E+04 8 23 .2722E+07 .2243E+06 12.1385 100 2.00 .2484930351216168E+06 6 24 .3100E+06 .2057E+06 1.5069 101 2.00 .3500000000000000E+03 8 ------ ----- -------- --------- ---- ------ ---------------------- -- 24 .4413E+08 .6237E+07 7.0752 90 121 mflops range: report all range statistics: maximum rate = 23.9501 mega-flops/sec. quartile q3 = 12.2139 mega-flops/sec. average rate = 9.4173 mega-flops/sec. geometric mean = 7.1309 mega-flops/sec. median q2 = 9.7401 mega-flops/sec. 
harmonic mean = 4.9224 mega-flops/sec. quartile q1 = 3.7740 mega-flops/sec. minimum rate = 1.1122 mega-flops/sec. standard dev. = 6.2889 mega-flops/sec. geom.mean dev. = 6.6916 mega-flops/sec. mean precision = 5.04 decimal digits 1 sensitivity analysis the sensitivity of the harmonic mean rate (mflops) to various weightings is shown in the table below. seven work distributions are generated by assigning two distinct weights to ranked kernels by quartiles. forty nine possible cpu workloads are then evaluated using seven sets of values for the total weights: ------ ------ ------ ------ ------ ------ ------ 1st qt: o o o o o x x 2nd qt: o o o x x x o 3rd qt: o x x x o o o 4th qt: x x o o o o o ------ ------ ------ ------ ------ ------ ------ total weights net mflops: x o ---- ---- 1.00 .00 2.04 3.03 5.86 7.53 10.51 13.13 17.49 .95 .05 2.12 3.15 5.79 7.15 9.77 11.25 14.94 .90 .10 2.21 3.28 5.72 6.81 9.13 9.85 13.05 .80 .20 2.42 3.58 5.58 6.21 8.07 7.88 10.40 .70 .30 2.67 3.94 5.45 5.71 7.23 6.56 8.65 .60 .40 2.97 4.38 5.32 5.29 6.55 5.63 7.41 .50 .50 3.35 4.92 5.20 4.92 5.98 4.92 6.47 ---- ---- ------ ------ ------ ------ ------ ------ ------ sensitivity of net mflops rate to use of optimal fortran code(sisd/simd model) 3.03 3.63 4.53 6.01 7.19 8.95 11.84 14.12 17.49 .00 .20 .40 .60 .70 .80 .90 .95 1.00 fraction of operations run at optimal fortran rates 1 cpu clock overhead (t err): run average standev minimum maximum tick 1 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 2 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 3 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 4 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 5 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 6 .000000E+00 .000000E+00 .000000E+00 .000000E+00 tick 7 .000000E+00 .000000E+00 .000000E+00 .000000E+00 data 7 .999866E-01 .543323E-06 .999856E-01 .999877E-01 data 7 .999860E-01 .917423E-06 .999843E-01 .999877E-01 tick 7 .000000E+00 .000000E+00 .000000E+00 .000000E+00 the 
experimental timing errors for all 7 runs -- --------- --------- --------- ----- ----- --- k t min t avg t max t err tick p-f -- --------- --------- --------- ----- ----- --- 1 .9000E-01 .9429E-01 .1000E+00 5.25% .00% 0 2 .1100E+00 .1171E+00 .1200E+00 3.86% .00% 0 3 .4000E-01 .4857E-01 .5000E-01 7.20% .00% 0 4 .1000E+00 .1071E+00 .1100E+00 4.22% .00% 0 5 .8000E-01 .8286E-01 .9000E-01 5.45% .00% 0 6 .7000E-01 .7429E-01 .8000E-01 6.66% .00% 0 7 .1200E+00 .1229E+00 .1300E+00 3.68% .00% 0 8 .1300E+00 .1414E+00 .1500E+00 4.52% .00% 0 9 .1500E+00 .1514E+00 .1600E+00 2.31% .00% 0 10 .2200E+00 .2257E+00 .2300E+00 2.19% .00% 0 11 .8000E-01 .8143E-01 .9000E-01 4.30% .00% 0 12 .8000E-01 .8286E-01 .9000E-01 5.45% .00% 0 13 .6200E+00 .6271E+00 .6300E+00 .72% .00% 0 14 .3900E+00 .3914E+00 .4000E+00 .89% .00% 0 15 .2900E+00 .2986E+00 .3000E+00 1.17% .00% 0 16 .1600E+00 .1657E+00 .1700E+00 2.99% .00% 0 17 .1900E+00 .2014E+00 .2100E+00 3.17% .00% 0 18 .2100E+00 .2171E+00 .2200E+00 2.08% .00% 0 19 .1100E+00 .1100E+00 .1100E+00 .00% .00% 0 20 .3700E+00 .3857E+00 .3900E+00 1.89% .00% 0 21 .8600E+00 .8757E+00 .8800E+00 .83% .00% 0 22 .2300E+00 .2371E+00 .2400E+00 1.90% .00% 0 23 .1600E+00 .1614E+00 .1700E+00 2.17% .00% 0 24 .1600E+00 .1600E+00 .1600E+00 .00% .00% 0 -- --------- --------- --------- ----- ----- --- net cpu timing variance (t err); a few % is ok: average standev minimum maximum terr 3.04% 2.00% .00% 7.20% 1 ******************************************** the livermore fortran kernels: m f l o p s ******************************************** computer : IBM RS/6000 Model 320 system : AIX 3.1, 64-bit, 20 MHz compiler : xlf -O (v1.1) 32kB cache date : 11/02/90 mean do span = 19 when the computer performance range is very large the net mflops rate of many fortran programs and workloads will be in the sub-range between the equi- weighted harmonic and arithmetic means depending on the degree of code parallelism and optimization. 
the least biased central measure is the geometric mean of 72 rates, quoted +- a standard deviation. kernel flops microsec mflop/sec span weight check-sums ok ------ ----- -------- --------- ---- ------ ---------------------- -- 1 .1512E+07 .9429E+05 16.0364 27 1.00 .2698573244316686E+03 7 2 .8096E+06 .1171E+06 6.9112 15 1.00 .8398933282494545E+02 8 3 .7992E+06 .4857E+05 16.4541 27 1.00 .1889516363539674E+01 8 4 .4560E+06 .1071E+06 4.2560 27 1.00 .4199475392699242E+01 8 5 .8320E+06 .8286E+05 10.0414 27 1.00 .2227830674507612E+02 8 6 .4032E+06 .7429E+05 5.4277 8 1.00 .3503988743573140E+18 0 7 .2688E+07 .1229E+06 21.8791 21 1.00 .1992004156003820E+03 8 8 .3370E+07 .1414E+06 23.8255 14 1.00 .2072380565635115E+05 8 9 .2652E+07 .1514E+06 17.5132 15 1.00 .1836777921157486E+05 8 10 .1350E+07 .2257E+06 5.9810 15 1.00 .1155903858502209E+05 8 11 .4784E+06 .8143E+05 5.8751 27 1.00 .4585812909752131E+04 8 12 .4992E+06 .8286E+05 6.0248 26 1.00 .1356005668640137E-04 2 13 .6944E+06 .6271E+06 1.1072 8 1.00 .2729559141139016E+11 0 14 .9504E+06 .3914E+06 2.4280 27 1.00 .1851535164869094E+08 1 15 .9240E+06 .2986E+06 3.0947 15 1.00 .7762980993917696E+04 8 16 .6160E+06 .1657E+06 3.7172 15 1.00 .9017120000000000E+06 0 17 .1404E+07 .2014E+06 6.9702 15 1.00 .2063158045128956E+03 8 18 .2288E+07 .2171E+06 10.5368 14 1.00 .6790452381066978E+04 8 19 .1008E+07 .1100E+06 9.1636 15 1.00 .8877614968312321E+02 8 20 .1893E+07 .3857E+06 4.9073 26 1.00 .4191399189884072E+04 8 21 .1000E+08 .8757E+06 11.4192 20 1.00 .8773979719923909E+08 0 22 .8160E+06 .2371E+06 3.4410 15 1.00 .4276978164173358E+02 8 23 .2002E+07 .1614E+06 12.4018 14 1.00 .3395238412030041E+04 8 24 .2392E+06 .1600E+06 1.4950 27 1.00 .9100000000000000E+02 8 ------ ----- -------- --------- ---- ------ ---------------------- -- 24 .3868E+08 .5161E+07 7.4948 19 130 mflops range: report all range statistics: maximum rate = 23.8255 mega-flops/sec. quartile q3 = 11.9105 mega-flops/sec. average rate = 8.7878 mega-flops/sec. 
geometric mean = 6.6733 mega-flops/sec. median q2 = 6.4680 mega-flops/sec. harmonic mean = 4.7800 mega-flops/sec. quartile q1 = 3.9866 mega-flops/sec. minimum rate = 1.1072 mega-flops/sec. standard dev. = 6.2141 mega-flops/sec. geom.mean dev. = 6.5640 mega-flops/sec. mean precision = 5.42 decimal digits 1 sensitivity analysis the sensitivity of the harmonic mean rate (mflops) to various weightings is shown in the table below. seven work distributions are generated by assigning two distinct weights to ranked kernels by quartiles. forty nine possible cpu workloads are then evaluated using seven sets of values for the total weights: ------ ------ ------ ------ ------ ------ ------ 1st qt: o o o o o x x 2nd qt: o o o x x x o 3rd qt: o x x x o o o 4th qt: x x o o o o o ------ ------ ------ ------ ------ ------ ------ total weights net mflops: x o ---- ---- 1.00 .00 2.09 3.01 5.33 6.65 8.83 11.67 17.21 .95 .05 2.17 3.12 5.29 6.40 8.36 10.20 14.67 .90 .10 2.26 3.25 5.25 6.16 7.94 9.06 12.78 .80 .20 2.46 3.53 5.17 5.75 7.20 7.40 10.16 .70 .30 2.70 3.87 5.09 5.38 6.60 6.26 8.44 .60 .40 2.99 4.28 5.02 5.06 6.08 5.42 7.21 .50 .50 3.35 4.78 4.95 4.78 5.64 4.78 6.30 ---- ---- ------ ------ ------ ------ ------ ------ ------ sensitivity of net mflops rate to use of optimal fortran code(sisd/simd model) 3.01 3.60 4.49 5.95 7.12 8.85 11.69 13.92 17.21 .00 .20 .40 .60 .70 .80 .90 .95 1.00 fraction of operations run at optimal fortran rates 1 ******************************************** the livermore fortran kernels: * summary * ******************************************** computer : IBM RS/6000 Model 320 system : AIX 3.1, 64-bit, 20 MHz compiler : xlf -O (v1.1) 32kB cache date : 11/02/90 mean do span = 167 when the computer performance range is very large the net mflops rate of many fortran programs and workloads will be in the sub-range between the equi- weighted harmonic and arithmetic means depending on the degree of code parallelism and optimization. 
the least biased central measure is the geometric mean of 72 rates, quoted +- a standard deviation. kernel flops microsec mflop/sec span weight check-sums ok ------ ----- -------- --------- ---- ------ ---------------------- -- 1 .1512E+07 .9429E+05 16.0364 27 1.00 .2698573244316686E+03 7 2 .8096E+06 .1171E+06 6.9112 15 1.00 .8398933282494545E+02 8 3 .7992E+06 .4857E+05 16.4541 27 1.00 .1889516363539674E+01 8 4 .4560E+06 .1071E+06 4.2560 27 1.00 .4199475392699242E+01 8 5 .8320E+06 .8286E+05 10.0414 27 1.00 .2227830674507612E+02 8 6 .4032E+06 .7429E+05 5.4277 8 1.00 .3503988743573140E+18 0 7 .2688E+07 .1229E+06 21.8791 21 1.00 .1992004156003820E+03 8 8 .3370E+07 .1414E+06 23.8255 14 1.00 .2072380565635115E+05 8 9 .2652E+07 .1514E+06 17.5132 15 1.00 .1836777921157486E+05 8 10 .1350E+07 .2257E+06 5.9810 15 1.00 .1155903858502209E+05 8 11 .4784E+06 .8143E+05 5.8751 27 1.00 .4585812909752131E+04 8 12 .4992E+06 .8286E+05 6.0248 26 1.00 .1356005668640137E-04 2 13 .6944E+06 .6271E+06 1.1072 8 1.00 .2729559141139016E+11 0 14 .9504E+06 .3914E+06 2.4280 27 1.00 .1851535164869094E+08 1 15 .9240E+06 .2986E+06 3.0947 15 1.00 .7762980993917696E+04 8 16 .6160E+06 .1657E+06 3.7172 15 1.00 .9017120000000000E+06 0 17 .1404E+07 .2014E+06 6.9702 15 1.00 .2063158045128956E+03 8 18 .2288E+07 .2171E+06 10.5368 14 1.00 .6790452381066978E+04 8 19 .1008E+07 .1100E+06 9.1636 15 1.00 .8877614968312321E+02 8 20 .1893E+07 .3857E+06 4.9073 26 1.00 .4191399189884072E+04 8 21 .1000E+08 .8757E+06 11.4192 20 1.00 .8773979719923909E+08 0 22 .8160E+06 .2371E+06 3.4410 15 1.00 .4276978164173358E+02 8 23 .2002E+07 .1614E+06 12.4018 14 1.00 .3395238412030041E+04 8 24 .2392E+06 .1600E+06 1.4950 27 1.00 .9100000000000000E+02 8 1 .2020E+07 .1243E+06 16.2529 101 2.00 .3677341471833584E+04 7 2 .1552E+07 .1414E+06 10.9737 101 2.00 .3605241763137281E+04 8 3 .1071E+07 .6000E+05 17.8433 101 2.00 .7068190058985076E+01 8 4 .8400E+06 .8143E+05 10.3158 101 2.00 .4199475392699242E+01 8 5 .1100E+07 .1143E+06 9.6250 101 
2.00 .3212322355798968E+03 8 6 .6720E+06 .6571E+05 10.2261 32 2.00 .1960856598681209E+30 0 7 .3555E+07 .1600E+06 22.2200 101 2.00 .4441910425224146E+04 8 8 .4277E+07 .1786E+06 23.9501 100 2.00 .1050887604075329E+07 8 9 .3606E+07 .2071E+06 17.4068 101 2.00 .8326105274034671E+06 8 10 .1727E+07 .4271E+06 4.0434 101 2.00 .5117258847672790E+06 8 11 .6400E+06 .1000E+06 6.4000 101 2.00 .2403492284702882E+06 8 12 .6800E+06 .1057E+06 6.4324 100 2.00 .4923343658447266E-04 2 13 .9184E+06 .8257E+06 1.1122 32 2.00 .1001353007004402E+12 0 14 .1111E+07 .5157E+06 2.1543 101 2.00 .2139450190318256E+09 2 15 .1650E+07 .5571E+06 2.9615 101 2.00 .2760671680302306E+06 8 16 .7560E+06 .2157E+06 3.5046 40 2.00 .1134287000000000E+07 0 17 .1818E+07 .2714E+06 6.6979 101 2.00 .7802492407439027E+04 8 18 .2178E+07 .2200E+06 9.9000 100 2.00 .4342684487413166E+06 1 19 .1394E+07 .1414E+06 9.8552 101 2.00 .3795271869848748E+04 8 20 .2080E+07 .4386E+06 4.7427 100 2.00 .2188343542345499E+06 7 21 .6250E+07 .5086E+06 12.2893 50 2.00 .1373396176759844E+09 0 22 .1202E+07 .3471E+06 3.4623 101 2.00 .2057023067602647E+04 8 23 .2722E+07 .2243E+06 12.1385 100 2.00 .2484930351216168E+06 6 24 .3100E+06 .2057E+06 1.5069 101 2.00 .3500000000000000E+03 8 1 .1752E+07 .1057E+06 16.5706 1001 1.00 .3580257008721524E+06 7 2 .1300E+07 .1157E+06 11.2328 101 1.00 .3605241763137281E+04 8 3 .9009E+06 .4286E+05 21.0210 1001 1.00 .7005200179428446E+02 8 4 .8400E+06 .4714E+05 17.8182 1001 1.00 .4199475392699242E+01 8 5 .1000E+07 .1000E+06 10.0000 1001 1.00 .3184210149404855E+05 8 6 .5952E+06 .5000E+05 11.9040 64 1.00 .2288741318728983E+26 0 7 .3184E+07 .1400E+06 22.7429 995 1.00 .4272975754667242E+06 8 8 .3564E+07 .1471E+06 24.2214 100 1.00 .1050887604075329E+07 8 9 .3091E+07 .1771E+06 17.4469 101 1.00 .8326105274034671E+06 8 10 .1545E+07 .3786E+06 4.0819 101 1.00 .5117258847672790E+06 8 11 .5500E+06 .8286E+05 6.6379 1001 1.00 .2340037680671110E+09 8 12 .6000E+06 .9000E+05 6.6667 1000 1.00 .2126321196556091E-03 1 13 .8064E+06 
.7229E+06 1.1156 64 1.00 .1552259922630282E+12 0 14 .1101E+07 .5900E+06 1.8663 1001 1.00 .2087503090558357E+11 4 15 .8250E+06 .2786E+06 2.9615 101 1.00 .2760671680302306E+06 8 16 .6625E+06 .1800E+06 3.6806 75 1.00 .9892820000000000E+06 0 17 .1591E+07 .2386E+06 6.6678 101 1.00 .7802492407439027E+04 8 18 .2178E+07 .2186E+06 9.9647 100 1.00 .4342684487413166E+06 1 19 .1182E+07 .1200E+06 9.8475 101 1.00 .3795271869848748E+04 8 20 .1300E+07 .2786E+06 4.6667 1000 1.00 .2128450971917007E+09 8 21 .6312E+07 .1689E+07 3.7384 101 1.00 .2812029997465708E+09 0 22 .9444E+06 .2714E+06 3.4792 101 1.00 .2057023067602647E+04 8 23 .2178E+07 .1771E+06 12.2952 100 1.00 .2484930351216168E+06 5 24 .2500E+06 .1657E+06 1.5086 1001 1.00 .3500000000000000E+04 8 ------ ----- -------- --------- ---- ------ ---------------------- -- 72 .1211E+09 .1781E+08 6.7992 167 373 mflops range: report all range statistics: maximum rate = 24.2214 mega-flops/sec. quartile q3 = 12.2139 mega-flops/sec. average rate = 9.3237 mega-flops/sec. geometric mean = 6.9979 mega-flops/sec. median q2 = 6.9702 mega-flops/sec. harmonic mean = 4.8474 mega-flops/sec. quartile q1 = 3.7172 mega-flops/sec. minimum rate = 1.1072 mega-flops/sec. standard dev. = 6.4344 mega-flops/sec. geom.mean dev. = 6.8418 mega-flops/sec. 
 mean precision =    5.18 decimal digits

1 top quartile: best architecture/application match

 kernel    flops    microsec  mflop/sec  span  weight
 ------  ---------  ---------  --------  ----  ------
      8  .3564E+07  .1471E+06   24.2214   100    1.00
      8  .4277E+07  .1786E+06   23.9501   100    2.00
      8  .3370E+07  .1414E+06   23.8255    14    1.00
      7  .3184E+07  .1400E+06   22.7429   995    1.00
      7  .3555E+07  .1600E+06   22.2200   101    2.00
      7  .2688E+07  .1229E+06   21.8791    21    1.00
      3  .9009E+06  .4286E+05   21.0210  1001    1.00
      3  .1071E+07  .6000E+05   17.8433   101    2.00
      4  .8400E+06  .4714E+05   17.8182  1001    1.00
      9  .2652E+07  .1514E+06   17.5132    15    1.00
      9  .3091E+07  .1771E+06   17.4469   101    1.00
      9  .3606E+07  .2071E+06   17.4068   101    2.00
      1  .1752E+07  .1057E+06   16.5706  1001    1.00
      3  .7992E+06  .4857E+05   16.4541    27    1.00
      1  .2020E+07  .1243E+06   16.2529   101    2.00
      1  .1512E+07  .9429E+05   16.0364    27    1.00
     23  .2002E+07  .1614E+06   12.4018    14    1.00
     23  .2178E+07  .1771E+06   12.2952   100    1.00
 ------  ---------  ---------  --------  ----  ------
 frac.  weights =     .2396

 average   rate =   18.9379 mega-flops/sec.
 harmonic  mean =   18.2533 mega-flops/sec.
 standard  dev. =    3.5208 mega-flops/sec.
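
The summary figures under the top-quartile table can be cross-checked by hand. The sketch below assumes that the "weight" column is used as the weight in both means and that "frac. weights" is the group's weight sum divided by 96 (24 kernels times the run weights 1 + 2 + 1). That reading is inferred from the printed numbers, not taken from the benchmark source, and the program name TOPQRT is made up for this illustration.

```fortran
      PROGRAM TOPQRT
C     Weighted arithmetic and harmonic means of the 18 top-quartile
C     rates listed above.  ASSUMPTION: the "weight" column supplies
C     the weights; this is inferred from the output, not from the
C     benchmark source.
      INTEGER N, I
      PARAMETER (N = 18)
      DOUBLE PRECISION RATE(N), WT(N), SW, SWR, SWI
      DATA RATE /24.2214D0, 23.9501D0, 23.8255D0, 22.7429D0,
     &           22.2200D0, 21.8791D0, 21.0210D0, 17.8433D0,
     &           17.8182D0, 17.5132D0, 17.4469D0, 17.4068D0,
     &           16.5706D0, 16.4541D0, 16.2529D0, 16.0364D0,
     &           12.4018D0, 12.2952D0/
      DATA WT   /1.D0, 2.D0, 1.D0, 1.D0, 2.D0, 1.D0, 1.D0, 2.D0,
     &           1.D0, 1.D0, 1.D0, 2.D0, 1.D0, 1.D0, 2.D0, 1.D0,
     &           1.D0, 1.D0/
C     SW = sum of weights, SWR = sum of w*r, SWI = sum of w/r
      SW  = 0.D0
      SWR = 0.D0
      SWI = 0.D0
      DO 10 I = 1, N
         SW  = SW  + WT(I)
         SWR = SWR + WT(I)*RATE(I)
         SWI = SWI + WT(I)/RATE(I)
   10 CONTINUE
C     96 = 24 kernels * (1 + 2 + 1), total weight of all 72 rates
      WRITE (*,'(A,F8.4)') ' frac. weights = ', SW/96.D0
      WRITE (*,'(A,F8.4)') ' average rate  = ', SWR/SW
      WRITE (*,'(A,F8.4)') ' harmonic mean = ', SW/SWI
      END
```

Under these assumptions the three values should come out close to the .2396, 18.9379 and 18.2533 printed above; small differences in the last digit are possible because the inputs are the rounded rates from the table.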

 kernel    flops    microsec  mflop/sec  span  weight
 ------  ---------  ---------  --------  ----  ------
     21  .6250E+07  .5086E+06   12.2893    50    2.00
     23  .2722E+07  .2243E+06   12.1385   100    2.00
      6  .5952E+06  .5000E+05   11.9040    64    1.00
     21  .1000E+08  .8757E+06   11.4192    20    1.00
      2  .1300E+07  .1157E+06   11.2328   101    1.00
      2  .1552E+07  .1414E+06   10.9737   101    2.00
     18  .2288E+07  .2171E+06   10.5368    14    1.00
      4  .8400E+06  .8143E+05   10.3158   101    2.00
      6  .6720E+06  .6571E+05   10.2261    32    2.00
      5  .8320E+06  .8286E+05   10.0414    27    1.00
      5  .1000E+07  .1000E+06   10.0000  1001    1.00
     18  .2178E+07  .2186E+06    9.9647   100    1.00
     18  .2178E+07  .2200E+06    9.9000   100    2.00
     19  .1394E+07  .1414E+06    9.8552   101    2.00
     19  .1182E+07  .1200E+06    9.8475   101    1.00
      5  .1100E+07  .1143E+06    9.6250   101    2.00
     19  .1008E+07  .1100E+06    9.1636    15    1.00
     17  .1404E+07  .2014E+06    6.9702    15    1.00
      2  .8096E+06  .1171E+06    6.9112    15    1.00
     17  .1818E+07  .2714E+06    6.6979   101    2.00
     17  .1591E+07  .2386E+06    6.6678   101    1.00
     12  .6000E+06  .9000E+05    6.6667  1000    1.00
     11  .5500E+06  .8286E+05    6.6379  1001    1.00
     12  .6800E+06  .1057E+06    6.4324   100    2.00
     11  .6400E+06  .1000E+06    6.4000   101    2.00
     12  .4992E+06  .8286E+05    6.0248    26    1.00
     10  .1350E+07  .2257E+06    5.9810    15    1.00
     11  .4784E+06  .8143E+05    5.8751    27    1.00
      6  .4032E+06  .7429E+05    5.4277     8    1.00
     20  .1893E+07  .3857E+06    4.9073    26    1.00
     20  .2080E+07  .4386E+06    4.7427   100    2.00
     20  .1300E+07  .2786E+06    4.6667  1000    1.00
      4  .4560E+06  .1071E+06    4.2560    27    1.00
     10  .1545E+07  .3786E+06    4.0819   101    1.00
     10  .1727E+07  .4271E+06    4.0434   101    2.00
     21  .6312E+07  .1689E+07    3.7384   101    1.00
 ------  ---------  ---------  --------  ----  ------
 frac.  weights =     .5104

 average   rate =    8.1674 mega-flops/sec.
 harmonic  mean =    7.1970 mega-flops/sec.
 standard  dev. =    2.6685 mega-flops/sec.
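
When reading these tables it helps to remember that the mflop/sec column is the flops column divided by the microsec column, since one flop per microsecond is one MFLOPS. The sketch below recomputes the rate for the first three rows of the table above; the program name RATES is made up for this illustration, and the inputs are the rounded four-digit values as printed, so the results need not match the printed rates in every digit.

```fortran
      PROGRAM RATES
C     Recompute mflop/sec = flops / microsec for the first three
C     rows of the table above (kernels 21, 23 and 6).  Inputs are
C     the rounded 4-digit values as printed, so the last digits
C     can differ from the printed rates.
      INTEGER N, I
      PARAMETER (N = 3)
      INTEGER KERN(N)
      DOUBLE PRECISION FLOPS(N), USEC(N)
      DATA KERN  /21, 23, 6/
      DATA FLOPS /.6250D+07, .2722D+07, .5952D+06/
      DATA USEC  /.5086D+06, .2243D+06, .5000D+05/
      DO 10 I = 1, N
         WRITE (*,'(A,I3,A,F9.4)') ' kernel', KERN(I),
     &        '  mflop/sec =', FLOPS(I)/USEC(I)
   10 CONTINUE
      END
```

The recomputed rates should agree with the printed 12.2893, 12.1385 and 11.9040 to about three significant figures; the benchmark presumably divides the unrounded values, which would account for the remaining difference.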

 kernel    flops    microsec  mflop/sec  span  weight
 ------  ---------  ---------  --------  ----  ------
     16  .6160E+06  .1657E+06    3.7172    15    1.00
     16  .6625E+06  .1800E+06    3.6806    75    1.00
     16  .7560E+06  .2157E+06    3.5046    40    2.00
     22  .9444E+06  .2714E+06    3.4792   101    1.00
     22  .1202E+07  .3471E+06    3.4623   101    2.00
     22  .8160E+06  .2371E+06    3.4410    15    1.00
     15  .9240E+06  .2986E+06    3.0947    15    1.00
     15  .8250E+06  .2786E+06    2.9615   101    1.00
     15  .1650E+07  .5571E+06    2.9615   101    2.00
     14  .9504E+06  .3914E+06    2.4280    27    1.00
     14  .1111E+07  .5157E+06    2.1543   101    2.00
     14  .1101E+07  .5900E+06    1.8663  1001    1.00
     24  .2500E+06  .1657E+06    1.5086  1001    1.00
     24  .3100E+06  .2057E+06    1.5069   101    2.00
     24  .2392E+06  .1600E+06    1.4950    27    1.00
     13  .8064E+06  .7229E+06    1.1156    64    1.00
     13  .9184E+06  .8257E+06    1.1122    32    2.00
     13  .6944E+06  .6271E+06    1.1072     8    1.00
 ------  ---------  ---------  --------  ----  ------
 frac.  weights =     .2500

 average   rate =    2.4708 mega-flops/sec.
 harmonic  mean =    2.0450 mega-flops/sec.
 standard  dev. =     .9548 mega-flops/sec.

1 sensitivity analysis

 the sensitivity of the harmonic mean rate (mflops)
 to various weightings is shown in the table below.
 seven work distributions are generated by assigning
 two distinct weights to ranked kernels by quartiles.
 forty nine possible cpu workloads are then evaluated
 using seven sets of values for the total weights:

            ------ ------ ------ ------ ------ ------ ------
 1st qt:       o      o      o      o      o      x      x
 2nd qt:       o      o      o      x      x      x      o
 3rd qt:       o      x      x      x      o      o      o
 4th qt:       x      x      o      o      o      o      o
            ------ ------ ------ ------ ------ ------ ------
 total
 weights                    net mflops:
   x    o
 ---- ----
 1.00  .00    2.05   2.96   5.35   7.01  10.19  13.03  18.03
  .95  .05    2.13   3.08   5.31   6.71   9.49  11.13  15.25
  .90  .10    2.22   3.21   5.27   6.43   8.88   9.72  13.21
  .80  .20    2.42   3.50   5.20   5.94   7.86   7.75  10.42
  .70  .30    2.66   3.85   5.12   5.51   7.05   6.45   8.60
  .60  .40    2.95   4.28   5.05   5.14   6.40   5.52   7.33
  .50  .50    3.32   4.82   4.99   4.82   5.85   4.82   6.38
 ---- ----  ------ ------ ------ ------ ------ ------ ------

 sensitivity of net mflops rate to use of optimal fortran code(sisd/simd model)

   2.97   3.57   4.46   5.97   7.17   8.99  12.05  14.52  18.25
    .00    .20    .40    .60    .70    .80    .90    .95   1.00
 fraction of operations run at optimal fortran rates

1 cumulative checksums:  run=  1

  k  vl=   471                  90                     19
  1  .5114652869602179E+05  .5253344959762263E+03  .3855104634738123E+02
  2  .5150345375910401E+03  .5150345375910401E+03  .1199847611784935E+02
  3  .1000742882775492E+02  .1009741436997868E+01  .2699309090770962E+00
  4  .5999250560998917E+00  .5999250560998917E+00  .5999250560998917E+00
  5  .4548871642006936E+04  .4589031936855670E+02  .3182615249296590E+01
  6  .3269630455327118E+25  .2801223712401726E+29  .5005698205104486E+17
  7  .6104251078096059E+05  .6345586321748780E+03  .2845720222862600E+02
  8  .1501268005821899E+06  .1501268005821899E+06  .2960543665193022E+04
  9  .1189443610576381E+06  .1189443610576381E+06  .2623968458796409E+04
 10  .7310369782389700E+05  .7310369782389700E+05  .1651291226431728E+04
 11  .3342910972387299E+08  .3433560406718403E+05  .6551161299645901E+03
 12  .3037601709365844E-04  .7033348083496094E-05  .1937150955200195E-05
 13  .2217514175186118E+11  .1430504295720575E+11  .3899370201627166E+10
 14  .2982147272226224E+10  .3056357414740366E+08  .2645050235527276E+07
 15  .3943816686146150E+05  .3943816686146150E+05  .1108997284845385E+04
 16  .1413260000000000E+06  .1620410000000000E+06  .1288160000000000E+06
 17  .1114641772491289E+04  .1114641772491289E+04  .2947368635898508E+02
 18  .6203834982018809E+05  .6203834982018809E+05  .9700646258667111E+03
 19  .5421816956926782E+03  .5421816956926782E+03  .1268230709758903E+02
 20  .3040644245595724E+08  .3126205060493570E+05  .5987713128405817E+03
 21  .4017185710665298E+08  .1961994538228350E+08  .1253425674274844E+08
 22  .2938604382289496E+03  .2938604382289496E+03  .6109968805961940E+01
 23  .3549900501737382E+05  .3549900501737382E+05  .4850340588614345E+03
 24  .5000000000000000E+03  .5000000000000000E+02  .1300000000000000E+02

 cumulative checksums:  run=  7

  k  vl=   471                  90                     19
  1  .3580257008721524E+06  .3677341471833584E+04  .2698573244316686E+03
  2  .3605241763137281E+04  .3605241763137281E+04  .8398933282494545E+02
  3  .7005200179428446E+02  .7068190058985076E+01  .1889516363539674E+01
  4  .4199475392699242E+01  .4199475392699242E+01  .4199475392699242E+01
  5  .3184210149404855E+05  .3212322355798968E+03  .2227830674507612E+02
  6  .2288741318728983E+26  .1960856598681209E+30  .3503988743573140E+18
  7  .4272975754667242E+06  .4441910425224146E+04  .1992004156003820E+03
  8  .1050887604075329E+07  .1050887604075329E+07  .2072380565635115E+05
  9  .8326105274034671E+06  .8326105274034671E+06  .1836777921157486E+05
 10  .5117258847672790E+06  .5117258847672790E+06  .1155903858502209E+05
 11  .2340037680671110E+09  .2403492284702882E+06  .4585812909752131E+04
 12  .2126321196556091E-03  .4923343658447266E-04  .1356005668640137E-04
 13  .1552259922630282E+12  .1001353007004402E+12  .2729559141139016E+11
 14  .2087503090558357E+11  .2139450190318256E+09  .1851535164869094E+08
 15  .2760671680302306E+06  .2760671680302306E+06  .7762980993917696E+04
 16  .9892820000000000E+06  .1134287000000000E+07  .9017120000000000E+06
 17  .7802492407439027E+04  .7802492407439027E+04  .2063158045128956E+03
 18  .4342684487413166E+06  .4342684487413166E+06  .6790452381066978E+04
 19  .3795271869848748E+04  .3795271869848748E+04  .8877614968312321E+02
 20  .2128450971917007E+09  .2188343542345499E+06  .4191399189884072E+04
 21  .2812029997465708E+09  .1373396176759844E+09  .8773979719923909E+08
 22  .2057023067602647E+04  .2057023067602647E+04  .4276978164173358E+02
 23  .2484930351216168E+06  .2484930351216168E+06  .3395238412030041E+04
 24  .3500000000000000E+04  .3500000000000000E+03  .9100000000000000E+02

1 table of speed-up ratios of mean rates (72 samples)

 arithmetic, geometric, harmonic means (am,gm,hm)
 the geometric mean is the least biased statistic.

 --------  ----  ------    -------- -------- -------- -------- -------- --------
 system    mean  mflops        ymp1 3090s180 IBM RS/6 c180-875   m/2000  vax-785
 --------  ----  ------    -------- -------- -------- -------- -------- --------

 cray      am=   78.230 :     1.000    4.455    8.390   19.364   19.412  285.511
 ymp1      gm=   36.630 :     1.000    2.995    5.234   10.008   10.175  140.885
 cft771.2  hm=   17.660 :     1.000    1.958    3.643    5.401    5.697   71.789
           sd=   86.750

 ibm       am=   17.560 :      .224    1.000    1.883    4.347    4.357   64.088
 3090s180  gm=   12.230 :      .334    1.000    1.748    3.342    3.397   47.038
 vsf2.2.0  hm=    9.020 :      .511    1.000    1.861    2.758    2.910   36.667
           sd=   16.320

 IBM RS/6  am=    9.324 :      .119     .531    1.000    2.308    2.314   34.028
 IBM RS/6  gm=    6.998 :      .191     .572    1.000    1.912    1.944   26.915
 xlf -O (  hm=    4.847 :      .274     .537    1.000    1.482    1.564   19.705
           sd=    6.434

 cdc       am=    4.040 :      .052     .230     .433    1.000    1.002   14.745
 c180-875  gm=    3.660 :      .100     .299     .523    1.000    1.017   14.077
 ftn 1.6   hm=    3.270 :      .185     .363     .675    1.000    1.055   13.293
           sd=    1.720

 mips      am=    4.030 :      .052     .229     .432     .998    1.000   14.708
 m/2000    gm=    3.600 :      .098     .294     .514     .984    1.000   13.846
 f77 1.31  hm=    3.100 :      .176     .344     .640     .948    1.000   12.602
           sd=    1.680

 dec       am=     .274 :      .004     .016     .029     .068     .068    1.000
 vax-785   gm=     .260 :      .007     .021     .037     .071     .072    1.000
 f77 4.2   hm=     .246 :      .014     .027     .051     .075     .079    1.000
           sd=     .080

1 version: 22/dec/86  mf392

 check for clock calibration only:
 total job    cpu time =  .14365E+03 sec.
 total 24 kernels time =  .12464E+03 sec.
 total 24 kernels flops=  .84745E+09 flops

--
John D.
McCalpin mccalpin@perelandra.cms.udel.edu Assistant Professor mccalpin@vax1.udel.edu College of Marine Studies, U. Del. J.MCCALPIN/OMNET
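
A note on reading the speed-up table above: each ratio appears to be the row system's mean rate divided by the column system's mean rate, taken separately for the arithmetic, geometric and harmonic means. The Python sketch below checks that reading against a few entries. It is an illustration under that assumption, not code from the benchmark. The dictionary keys and the function name `speedup` are my own labels.

```python
# Mean rates in mflops, copied from the "mflops" column of the speed-up table.
means = {
    "ymp1":     {"am": 78.230, "gm": 36.630, "hm": 17.660},
    "3090s180": {"am": 17.560, "gm": 12.230, "hm": 9.020},
    "IBM RS/6": {"am": 9.324,  "gm": 6.998,  "hm": 4.847},
    "c180-875": {"am": 4.040,  "gm": 3.660,  "hm": 3.270},
    "m/2000":   {"am": 4.030,  "gm": 3.600,  "hm": 3.100},
    "vax-785":  {"am": 0.274,  "gm": 0.260,  "hm": 0.246},
}


def speedup(row, col, stat):
    """Ratio of the row system's mean to the column system's mean."""
    return means[row][stat] / means[col][stat]


# Cray Y-MP relative to the RS/6000, for each of the three means.
for stat in ("am", "gm", "hm"):
    print(stat, round(speedup("ymp1", "IBM RS/6", stat), 3))
```

The three ratios printed should correspond to the 8.390, 5.234 and 3.643 entries in the ymp1 rows under the IBM RS/6 column. Entries involving the vax-785 can differ in the last digit, since its means are only given to three significant figures.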