Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!sdd.hp.com!hplabs!hpda!hpcuhc!spuhler
From: spuhler@hpcuhc.cup.hp.com (Tom Spuhler)
Newsgroups: comp.benchmarks
Subject: Re: Re: unbc - A New!, Improved! bc benchmark (nope)
Message-ID: <115440004@hpcuhc.cup.hp.com>
Date: 20 Dec 90 01:28:46 GMT
References: <7710@eos.arc.nasa.gov>
Organization: GSY Systems Performance Section
Lines: 79

# >for your faster CPU's? Does management want a richer instruction mix to
#
# Er, sorry, I must be dense, but where does the "richer instruction mix"(tm)
# come in (sounds like coffee, thank god I drink tea). Seems like more of

Come on, Eugene, you're tripping over the easy ones :-) A richer
instruction mix means a more varied one, i.e. one that uses a larger
subset of the machine instructions. Not particularly interesting, as
the important criterion is how well the tested instruction mix matches
your expected workloads (for richer or poorer :-) but I get more warm
fuzzies from tests that exercise the 'richer' mixes than the 'poorer'
ones, as real-life usage tends to be on the richer side (for the kinds
of computers I'm interested in). It was common terminology around
here. I didn't invent it (now, as to the concept of "creamier" code,
I'll take some blame for that).

# the same. Do you work per change in a marketing department? Longer running?
# Longer is not necessarily better (no sex jokes please). Seems this could

Sorry, no, to the marketing question. Longer is better in that it
tends to minimize both the effect of the limited precision of the
reporting mechanism (in this case /bin/time) and the impact of startup
effects (something of concern in the 'bc' benchmark). When the run
times drop below a couple of seconds, I personally start to worry
about the precision of /bin/time. I like 'em to run at least 10
seconds. Unfortunately, I didn't achieve that goal with 2^9999/3^6308.
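
A minimal sketch of the arithmetic behind that precision worry, added
for illustration and not part of the benchmark itself: if a timer only
reports to some granularity, the worst-case relative error is roughly
that granularity divided by the run time. The 0.1 second granularity
and the function name are assumptions here, not measured properties of
/bin/time.

```python
def worst_case_relative_error(run_time_s, resolution_s):
    """Worst-case relative error when a timer reports only to resolution_s."""
    if run_time_s <= 0 or resolution_s <= 0:
        raise ValueError("run time and resolution must be positive")
    return resolution_s / run_time_s


# Assumed timer granularity (hypothetical; check what your own timer reports).
RESOLUTION_S = 0.1

if __name__ == "__main__":
    # Compare a sub-second run, a "couple of seconds" run, and a 10 second run.
    for run_time_s in (0.5, 2.0, 10.0):
        err = worst_case_relative_error(run_time_s, RESOLUTION_S)
        print(f"{run_time_s:5.1f} s run: reporting error up to about {err:.0%}")
```

Under that assumed granularity, a 10 second run keeps the reporting
error near 1%, while a sub-second run can be off by 10% or more.
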
On some systems, I expect it can run in less than a second, but I was
limited by the 'bc' program and my interest in simplicity. Longer is
not 'necessarily' better, but I find it usually is for accuracy of
results, although 'longer' may reduce the number of times it's run or
its usefulness, which may be more important.

# Longer is not necessarily better (no sex jokes please). Seems this could
# be optimized as well. Fortunately (?) I didn't see the beginnings of the

Optimizable? Oh sure. This is always true. Vendors could hard-code the
answer. It's a question of ease, likelihood, and dependence. How hard
is it to optimize for this case? 2^9999/3^6308 is harder to optimize
for than 2^5000/2^5000, assuming an optimization that covers more than
just the hard-coded case (easy to detect) and is somewhat consistent
with the intent of 'bc'. How likely is someone to do something like
that? Depends on how hung up the world gets on a single benchmark. How
likely is someone to optimize for Dhrystone? (Seems to have happened.)
It's all a matter of contest.

# > It is better to have some data, no matter how limited, as long
# > as you understand it, then no data at all.
#
# Nope. Beg to disagree. It can be more damaging. I think some one is suing
# someone else over performance claims, getting nasty.
# Note: in a first post, I cited the APL benchmark (Gaussian sum) where
# the adds were all replaced by the simple (n+1)n/2 formula (n was = 256).
#

We always have to live with imperfect information. True, the results
of a benchmark running your application(s) on a variety of vendor
machines with a variety of configurations are ideal, but they can be a
little expensive to obtain. Something like the bc or nbc benchmarks
may not be very good, but they are cheap to run. Results from a good
number of machines are available. Note that the results of both
efforts may be no more useful (or less useful) to someone else in
determining the relative performance of the tested boxes.
And guess which one costs less. Using bc, or better nbc, can help
classify systems and direct other investigative efforts. The
combination of bc and nbc results is considerably more useful than
either one alone. Keep adding more benchmarks and you can develop a
performance profile of a system. Does SPEC alone allow one to
characterize the performance of a system? Definitely not. Does it
help? Sure. How about TPC-A? For any single characterization, one can
cite exceptions. Only the complete universe of information is
universally useful. Performance information is damaging only if it is
misused (happens a lot). ["There is no enlightenment until there is
total enlightenment"].

# It's hard to understand the behavior of some benchmark results, even by
# some of the programmer who wrote a given benchmark or compiler.

and it's even harder to come up with a single all-singing, all-dancing
benchmark which will allow anyone to evaluate the performance of a
variety of boxes running whatever applications they choose.

-Tom Spuhler, Spuhler@cup.hp.com