Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!sol.ctr.columbia.edu!emory!utkcs2!de5
From: de5@ornl.gov (Dave Sill)
Newsgroups: comp.benchmarks
Subject: Re: Don't use bc (was: More issues of benchmarking)
Message-ID: <1990Dec3.191756.15280@cs.utk.edu>
Date: 3 Dec 90 19:17:56 GMT
References: <122@thinc.UUCP> <5042@taux01.nsc.com>
Sender: news@cs.utk.edu (USENET News System)
Reply-To: Dave Sill
Organization: Oak Ridge National Laboratory
Lines: 75

In article <5042@taux01.nsc.com>, amos@taux01.nsc.com (Amos Shapir) writes:
>[Quoted from the referenced article by ethan@thinc.UUCP (Ethan Lish of THINC)]
>>
>>Greetings -
>>
>>	This _benchmark_ does *NOT* have a legitimate value!
>>
>
>Sure it doesn't; I wonder how no one else noted this yet: "bc" is probably
>the worst choice of a utility to benchmark by.

It may not be rigorous, but it does have value.  For one thing, it's short
enough to be memorized and easily typed at a box at, say, an expo.

>On most UNIX systems, it
>just parses expressions, and forks "dc" to execute them ("dc" is a reverse-
>polish string based numeric interpreter).  So the results depend on how
>fast your system forks, and how "bc" and "dc" communicate.

How does that invalidate the results?  That's like penalizing an optimizing
compiler for taking shortcuts the other one didn't.  If the bc benchmark
runs faster on system A than it does on system B because vendor A took the
time to optimize bc, then good for them!

The danger is not some inherent unreliability in the benchmark; it's in
incorrectly interpreting the results.  This highlights very well the
fundamental danger of benchmarking: generalization.  Just because one
system outperforms another on, say, a floating-point benchmark doesn't mean
that it will *always* outperform it on all floating-point code.
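For anyone who wants to try it, the benchmark under discussion (the
2^5000/2^5000 expression quoted elsewhere in this thread) can be timed with
something like the following; the exact invocation is my sketch, not a
canonical form:

```shell
# Time the bc benchmark from this thread: compute 2^5000/2^5000.
# "time" measures the whole pipeline, so on a traditional bc that
# forks dc, the fork and pipe overhead are included -- which is the
# point of contention above.
time echo '2^5000/2^5000' | bc
# prints: 1 (plus timing figures on stderr)
```

Whatever the elapsed time, the printed result must be 1; a system that gets
anything else has a broken bc, however fast it runs.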
>The bottom line is: comparing "bc" runs on different systems is necessarily
>comparing apples and oranges (or at least plums & prunes) unless you're
>sure you have the same version of "bc", "dc", and UNIX.  Results posted
>here so far indicate most comparisons are indeed meaningless.

Bc is bc.  If it takes 2^5000/2^5000 and correctly calculates the result,
what does it matter how it gets there?  I.e., this benchmark measures bc's
performance.  Interpreting it as a hardware benchmark is fallacious, since
hardware performance is only one factor in the result.

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) replies to the same
article:

>Of course the biggest problem is that almost no one actually *uses*
>`bc' for any large amount of computation, so no vendor has any
>incentive to optimize its performance.

Ah, but would it be better to benchmark something the vendor has expected
and optimized?  If you're looking for actual performance instead of
theoretical peak performance, perhaps it's better to throw something
unexpected at them.

>A secondary problem is that one could trivially optimize the benchmark
>away by adding a constant-expression simplifier to `bc' before it
>calls `dc', but everyone already knew that....

Yes, but that would be readily apparent, wouldn't it?  And it wouldn't
invalidate the test.  You just need to keep in mind that you're testing bc,
and that its dependence on hardware performance is only indirect.

>(Maple evaluated the expression on my SGI 4D/25 in 0.4 seconds wall
>time).

Exactly, so Maple is faster than bc.  You can't interpret this to mean that
the SGI is faster than all the other systems that take longer to do it with
bc.

-- 
Dave Sill (de5@ornl.gov)
Martin Marietta Energy Systems
Workstation Support