Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!think.com!mintaka!ogicse!uidaho!tamaluit.phys.uidaho.edu!pbickers
From: pbickers@tamaluit.phys.uidaho.edu (Paul Bickerstaff)
Newsgroups: comp.benchmarks
Subject: harmonic series (was Re: more bc babble)
Message-ID: <1990Dec15.094523.10622@groucho>
Date: 15 Dec 90 09:45:23 GMT
References: <1990Dec11.163826.5439@eagle.lerc.nasa.gov>
Sender: @groucho
Reply-To: pbickers@tamaluit.phys.uidaho.edu (Paul Bickerstaff)
Organization: mrc
Lines: 77

>
> Another easy-to-memorize benchmark is the computation of the sum
> of the first 10 million terms in the harmonic series.

I've also used this as a quick (but very rough) guide to floating point
speed (but mainly to orient myself on a new system).

> This is a FORTRAN version, it should not be too hard to translate
> even without f2c :-)
>
>       PROGRAM RR
>       DOUBLE PRECISION R
>       R=0.0
>       DO 10 I=1,10000000
                  ^^^^^^^^
As a general tutorial type comment, one should always sum series by
doing the smallest terms first. This is for numerical accuracy. OK, the
smallest here is 10^-7 and we're using double precision, but the comment
still stands as a matter of programming practice, and summing in the
reverse order may give very different results.

>       R=R+1/DBLE(I)
>    10 CONTINUE
>       WRITE(*,*)R,I
>       END
>
> This one is obviously testing floating-point performance only. The

I don't think this is true. It is also testing a tight do loop.

> emphasis on divisions might give biased results. It vectorizes
                        ^^^^
This and other things *will* give biased results. Heck, *every*
benchmark gives biased results. The trick is to choose a benchmark (or
create your own) which matches your applications. There is not a single
benchmark, MFLOPS, MIPS, SPECMARKS or whatever, that means anything
worthwhile if you don't know exactly how relevant it is to what you're
doing.

If the harmonic series has any value at all, it is in educating people
just how useless benchmarks are. For example:
(I won't include exact code, but mine was double precision with reverse
order of summation.)
(Also, I only summed 1 million terms.)

IBM RS6000/320        2.14   xlf
                      1.41   xlf -O
                      1.33   xlf -O -Q'OPT(3)'   (July '90 results)

Mips Magnum 3000      1.2    f77 -O0   (i.e. no optimizations)
                      0.9    f77       (default level = f77 -O1)
                      0.5    f77 -O2   (Fortran 2.11, RISCos 4.51)

Times are all user times in secs.

So how come a 3.6 MFLOP machine can run Fortran at about twice the speed
of a 7.4 MFLOP machine? (Yes, I have this the right way around!)

Answer: Easy.

(This article is not intended as IBM bashing. I have Fortran codes which
do run much faster on the RS6000. The IBM does excel at the LINPACK
benchmark, but unless you do a lot of 100x100 array manipulations the
7.4 MFLOP LINPACK number clearly doesn't mean much. Nor do the times for
the harmonic series.)

Paul Bickerstaff                 Internet: pbickers@tamaluit.phys.uidaho.edu
Physics Dept., Univ. of Idaho    Phone:    (208) 885 6809
Moscow ID 83843, USA             FAX:      (208) 885 6173
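
A rough illustration of the remark about summing the smallest terms
first. This is only a sketch, in Python rather than Fortran, and it is
not the code that was timed. In double precision the order makes very
little difference for a million terms, so the sketch assumes single
precision instead, emulated by rounding through the struct module after
every operation. The names to_single and harmonic_single are made up
for the example.

```python
import math
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def harmonic_single(indices):
    """Sum 1/i over the given indices, rounding to single precision at each step."""
    total = 0.0
    for i in indices:
        term = to_single(1.0 / i)
        total = to_single(total + term)
    return total


n = 1_000_000  # same term count as the timings quoted above

forward = harmonic_single(range(1, n + 1))   # largest terms first
reverse = harmonic_single(range(n, 0, -1))   # smallest terms first

# Accurately summed double-precision value, used only as a reference.
reference = math.fsum(1.0 / i for i in range(1, n + 1))

err_forward = abs(forward - reference)
err_reverse = abs(reverse - reference)

print("forward:", forward, "error:", err_forward)
print("reverse:", reverse, "error:", err_reverse)
```

In the forward loop each late term is close to the rounding step of the
running total, so much of it is lost on every addition. In the reverse
loop the small terms accumulate among themselves before the large ones
arrive, and the error should come out noticeably smaller.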