Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!sdsu!ncr-sd!conan!steves
From: steves@conan.SanDiego.NCR.COM (Steve Schlesinger)
Newsgroups: comp.arch
Subject: Re: Neal Nelson Benchmarks
Message-ID: <2557@ncr-sd.SanDiego.NCR.COM>
Date: 23 Feb 90 18:37:38 GMT
References: <196@zds-ux.UUCP>
Sender: news@ncr-sd.SanDiego.NCR.COM
Reply-To: steves@conan.SanDiego.NCR.COM (Steve Schlesinger)
Organization: NCR Corporation, Rancho Bernardo
Lines: 102

In article <196@zds-ux.UUCP> gerry@zds-ux.UUCP (Gerry Gleason) writes:
>I have just been going through a bunch of marketing hype for Neal
>Nelson. He claims that his "Business Benchmark" measures how
>well machines perform on "tasks like word processing, spread sheets,
>database management, accounting, programming and CAD," but I have
>never seen anything that backs this up with analysis or real data.
>
> [ paragraph deleted ]
>
>I was hoping that someone has already done some analysis of these
>benchmarks, and can confirm my suspicion that these test not only
>are bogus, but don't even measure what they claim to. Unfortunately,
>at least some important fraction of the market uses these benchmarks
>to evaluate products, so many of us must apply them to our products
>even though we suspect them of being misleading. If they really are
>bogus, what can be done to publicly discredit them, so further harm
>is not done?
>
>Gerry Gleason

***************************************************************
The following is only my opinion
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

My company is a licensee of the Neal Nelson Benchmarks. I have been
involved in benchmarking and performance evaluation for many years.
I do not have a very high opinion of the NN Benchmarks.

NN is covered by a very strict licensing agreement. The results of
running the benchmarks cannot be published, i.e.,
a licensee cannot publicly reveal the individual or composite results
of the benchmarks. You cannot say my system ran 70 gazillion
dhrystones, 80 gathousand Linpacks and 22 on the NN suite. (Heck,
another division of my company was also a licensee and they couldn't
even tell us their raw results!! Oh yes, the license agreement only
permits the source to be on a single machine on a single site.)

All you can say is that your system was Y times the performance of
Fasta Computer - Model A on the suite. How do you know this, if Fasta
Computer didn't publish their numbers? NN tells you this as part of
your license agreement. You report your numbers back to them and they
give you the relative numbers of a specified number of other systems.
If you want more data, you pay NN more $$.

On the technical side, the benchmarks are **VERY** simple. I cannot
reveal the details under the license agreement. The only good thing
is that you can run multiple copies of the suite in parallel fairly
easily.

The computational benchmarks are of two types: arithmetic for
different data types, and memory moves. The arithmetic ones weight
multiply and divide far more heavily, relative to plus and minus,
than real programs do. This explains some of the results in the
article mentioned, where a CISC machine (Sun3 with 68020 and 68881 or
other FP silicon) "beat" a RISC machine (Sun4 with SPARC, which
doesn't have multiply/divide for integer arithmetic). The memory move
tests will show how the cache/memory perform FOR ONE SPECIFIC TYPE OF
MOVE.

The disk I/O tests are not as idiosyncratic as the processor and
memory tests, but they are **VERY** simple.

What really bothered me about the tests was the "C" coding style.
Yes, I know this doesn't necessarily mean the benchmarks are not
meaningful. The code looked like it had originally been written in
Cobol, then translated line for line into "C". The program structure
(what there was of it) didn't look anything like any "C" program
anywhere.
It said to me the author had little experience programming in "C".

One effect of the coding style was that it made the code difficult
to optimize. This can be seen two ways. One is that it makes the
suite more accurate, since it removes the variable of compiler
optimization from system comparisons (a "notorious" problem with
Dhrystone - especially 1.1). The other is that it makes it less
accurate, since the compiler's ability to optimize real code is an
important attribute of a system. I tend to the second view: since NN
promotes his suite as a measure of system performance, it seems to me
the code should test the optimizer.

My recommendation is not to pay much attention to anything you see
published about results on the NN suite. I place it slightly below
dhrystone in overall usability. You can get similar results by
spending several hours enhancing the public domain Byte benchmarks.

I look forward to a future suite from SPEC that will include system
I/O tests. If these benchmarks are anything like SPEC Release 1.0,
the NN suite will be technically obsolete.

I admire NN's ability as a business person. He saw a need and filled
it. Most technically naive system buyers don't understand the
performance data that floats around. Observe the constant
misunderstandings on comp.arch about what a "mip" is. We techies
seldom agree on much. NN created a simple set of tests that could be
explained to Joe Naiveuser. Joe N. could understand the marketing
hype of the test and it gave him confidence in his computer purchase.

Steve Schlesinger

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
The preceding is only my opinion
It does not reflect the opinion of my employer or any other
person or organization.
***************************************************************
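
The point above about the arithmetic tests, that the weight given to
multiply and divide can decide which machine "wins", can be sketched
with a toy cost model. Everything here is an assumption made for
illustration: the machine labels, the per-operation cycle costs, and
the two operation mixes are invented, and none of it is taken from
the NN suite or from measurements of any real system.

```python
# Hypothetical per-operation costs in cycles. Invented numbers, not
# measurements: one machine with hardware multiply/divide, one that is
# faster on add/subtract but does multiply/divide in software.
COST = {
    "cisc_hw_muldiv": {"add": 2, "sub": 2, "mul": 10, "div": 20},
    "risc_sw_muldiv": {"add": 1, "sub": 1, "mul": 20, "div": 40},
}

# Two hypothetical operation mixes; the fractions in each sum to 1.
MIXES = {
    "mul_div_heavy": {"add": 0.25, "sub": 0.25, "mul": 0.25, "div": 0.25},
    "add_sub_heavy": {"add": 0.60, "sub": 0.36, "mul": 0.03, "div": 0.01},
}


def cycles_per_op(machine, mix):
    """Average cycles per operation for a machine under a given mix."""
    costs = COST[machine]
    return sum(frac * costs[op] for op, frac in MIXES[mix].items())


def speedup(mix, machine="risc_sw_muldiv", baseline="cisc_hw_muldiv"):
    """How many times faster `machine` is than `baseline` on `mix`."""
    return cycles_per_op(baseline, mix) / cycles_per_op(machine, mix)


if __name__ == "__main__":
    for mix in MIXES:
        print(f"{mix}: risc/cisc speedup = {speedup(mix):.2f}")
```

With these made-up costs the same pair of machines changes rank
depending only on the mix, which is why a relative figure of the form
"Y times Model A" says little unless the mix resembles real programs.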