Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!sol.ctr.columbia.edu!sdsu!ncr-sd!conan!steves
From: steves@conan.SanDiego.NCR.COM (Steve Schlesinger)
Newsgroups: comp.arch
Subject: Re: Neal Nelson Benchmarks
Message-ID: <2557@ncr-sd.SanDiego.NCR.COM>
Date: 23 Feb 90 18:37:38 GMT
References: <196@zds-ux.UUCP>
Sender: news@ncr-sd.SanDiego.NCR.COM
Reply-To: steves@conan.SanDiego.NCR.COM (Steve Schlesinger)
Organization: NCR Corporation, Rancho Bernardo
Lines: 102

In article <196@zds-ux.UUCP> gerry@zds-ux.UUCP (Gerry Gleason) writes:
>I have just been going through a bunch of marketing hype for Neal
>Nelson.  He claims that his "Business Benchmark" measures how
>well machines perform on "tasks like word processing, spread sheets,
>database management, accounting, programming and CAD," but I have
>never seen anything that backs this up with analysis or real data.
>
>  [ paragraph deleted ]
>
>I was hoping that someone has already done some analysis of these
>benchmarks, and can confirm my suspicion that these tests not only
>are bogus, but don't even measure what they claim to.  Unfortunately,
>at least some important fraction of the market uses these benchmarks
>to evaluate products, so many of us must apply them to our products
>even though we suspect them of being misleading.  If they really are
>bogus, what can be done to publicly discredit them, so further harm
>is not done?
>
>Gerry Gleason

***************************************************************
	The following is only my opinion 
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 

My company is a licensee of the Neal Nelson Benchmarks.  I have
been involved in benchmarking and performance evaluation for many
years.  I do not have a very high opinion of the NN Benchmarks.

NN is covered by a very strict licensing agreement.  The results of
running the benchmarks cannot be published, i.e., a licensee cannot
publicly reveal the individual or composite results of the benchmarks.
You cannot say my system ran 70 gazillion dhrystones, 80 gathousand
Linpacks and 22 on the NN suite.  (Heck, another division of my company
was also a licensee and they couldn't even tell us their raw results!
Oh yes, the license agreement only permits the source to be on
a single machine on a single site.)

All you can say is that your system was Y times the performance of 
Fasta Computer - Model A on the suite.  How do you know this, if Fasta
Computer didn't publish their numbers?  NN tells you this as part of
your license agreement.  You report your numbers back to them
and they give you the relative numbers of a specified number of other
systems.  If you want more data, you pay NN more $$.

On the technical side, the benchmarks are **VERY** simple.  I cannot
reveal the details under the license agreement.  The only good thing
is that you can run multiple copies of the suite in parallel fairly
easily.
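
As an illustration of what "multiple copies in parallel" amounts to, here
is a minimal Python sketch. It is not taken from the NN suite: the function
name run_copies and the stand-in workload are assumptions made up for this
example. It starts N copies of an arbitrary command at the same time and
reports the wall-clock time until the last copy finishes.

```python
import subprocess
import sys
import time


def run_copies(cmd, copies):
    """Start `copies` instances of cmd at once.

    Returns (elapsed_seconds, exit_codes). Hypothetical helper,
    not part of any benchmark suite.
    """
    start = time.perf_counter()
    # Launch every copy before waiting on any, so they overlap in time.
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    exit_codes = [p.wait() for p in procs]
    return time.perf_counter() - start, exit_codes


if __name__ == "__main__":
    # Stand-in workload: a short Python busy loop, NOT an NN test.
    workload = [sys.executable, "-c", "sum(i * i for i in range(200000))"]
    for n in (1, 2, 4):
        elapsed, codes = run_copies(workload, n)
        print(f"{n} copies: {elapsed:.2f} s wall clock, exit codes {codes}")
```

Comparing the elapsed time for 1, 2 and 4 copies gives a rough picture of
how a system holds up as the load grows.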

The computational benchmarks are of two types: arithmetic on different
data types, and memory moves.  The arithmetic ones overemphasize multiply
and divide: those operations appear more often, relative to add and
subtract, than they do in real programs.  This explains some of the results
in the article mentioned, where a CISC machine (Sun3 with 68020 and 68881
or other FP silicon) "beat" a RISC machine (Sun4 with SPARC, which doesn't
have integer multiply/divide instructions).  The memory move tests will
show how the cache and memory perform FOR ONE SPECIFIC TYPE OF MOVE.
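
The instruction-mix point can be shown with a small Python sketch. Every
number below is invented for illustration: the machine names, the cycle
costs and both operation mixes are assumptions, not NN data and not
measurements of any real system. The sketch computes the average cost per
operation for two hypothetical machines under two different mixes.

```python
# Hypothetical per-operation costs in cycles. Illustrative only.
COSTS = {
    # slower add/subtract, multiply/divide done in hardware
    "machine_hw_muldiv": {"add": 2, "sub": 2, "mul": 20, "div": 40},
    # faster add/subtract, multiply/divide done in software
    "machine_sw_muldiv": {"add": 1, "sub": 1, "mul": 40, "div": 80},
}

# A benchmark that weights all four operations equally (assumed).
BENCH_MIX = {"add": 25, "sub": 25, "mul": 25, "div": 25}

# An add/subtract-heavy application mix (assumed, not measured).
ASSUMED_APP_MIX = {"add": 70, "sub": 28, "mul": 1, "div": 1}


def cycles_per_op(mix, costs):
    """Weighted average cycles per operation for the given mix."""
    total = sum(mix.values())
    return sum(weight * costs[op] for op, weight in mix.items()) / total


def faster_machine(mix):
    """Name of the hypothetical machine with the lower average cost."""
    return min(COSTS, key=lambda name: cycles_per_op(mix, COSTS[name]))


if __name__ == "__main__":
    for label, mix in (("benchmark", BENCH_MIX), ("application", ASSUMED_APP_MIX)):
        for name, costs in COSTS.items():
            print(f"{label:12s} {name}: {cycles_per_op(mix, costs):.2f} cycles/op")
        print(f"{label:12s} winner: {faster_machine(mix)}")
```

With these made-up costs, the equal-weight mix favors machine_hw_muldiv
(16 versus 30.5 cycles per operation), while the add-heavy mix favors
machine_sw_muldiv (2.18 versus 2.56).  The same two machines swap places
purely because of how the operations are weighted.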

The disk I/O tests are not as idiosyncratic as the processor and memory tests,
but they are **VERY** simple.

What really bothered me about the tests was the "C" coding style.  Yes,
I know this doesn't necessarily mean the benchmarks are not meaningful.
The code looked like it had originally been written in COBOL, then
translated line for line into "C".  The program structure (what there
was of it) didn't look anything like any "C" program anywhere.
It said to me the author had little experience programming in "C".

One effect of the coding style was that it made the code difficult to
optimize.  This can be seen two ways.  One is that it makes the suite more
accurate, since it removes the variable of compiler optimization from system
comparisons (a "notorious" problem with Dhrystone - especially 1.1).  The
other is that it makes it less accurate, since the compiler's ability to
optimize real code is an important attribute of a system.

I tend to the second view.  Since NN promotes his suite as a measure of
system performance, it seems to me the code should test the optimizer.

My recommendation is not to pay much attention to anything you see published
about results on the NN suite.  I place it slightly below Dhrystone in
overall usability.  You can get similar results by spending several hours
enhancing the public domain BYTE benchmarks.

I look forward to a future suite from SPEC that will include system I/O tests.
If these benchmarks are anything like SPEC Release 1.0, the NN suite will
be technically obsolete.

I admire NN's ability as a business person.  He saw a need and filled it.
Most technically naive system buyers don't understand the performance data
that floats around.  Observe the constant misunderstandings on comp.arch
about what a "mip" is.  We techies seldom agree on much.  NN created
a simple set of tests that could be explained to Joe Naiveuser.  Joe N.
could understand the marketing hype of the test and it gave him confidence
in his computer purchase.

Steve Schlesinger

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 
	The preceding is only my opinion
	It does not reflect the opinion of my employer or
	any other person or organization.
***************************************************************