Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!cs.utexas.edu!uunet!zephyr.ens.tek.com!orca!shark!adamsc
From: adamsc@shark.WV.TEK.COM (Chuck Adams)
Newsgroups: comp.windows.x
Subject: Re: Problems with XSTONES calculations in xbench
Summary: Another look at Xstones calculations in xbench.
Message-ID: <4411@orca.WV.TEK.COM>
Date: 31 Aug 89 22:39:53 GMT
References: <4344@orca.WV.TEK.COM> <716@megatek.UUCP>
Sender: nobody@orca.WV.TEK.COM
Lines: 97

> However, I'm not sure that the actual problem is in the code. My impression is
> that the actual algorithm used is what it should be, and the text description
> should be changed to reflect the code, not the other way around.
>
> Remember that the Xstones number is synthetic, and doesn't need to actually
> represent any real comparisons. As long as different Xstones numbers can be
> compared in some predictable fashion then all is well.

I realize xstones are synthetic, but as such they are supposed to convey a
warm and fuzzy feeling. In my case they do not, because the results are
exactly the opposite of what the documentation states. Claus explicitly
stated "the weights are based on our experience ... " Claus seems to have
spent a lot of effort working out the proposed weights, because a lot of
the documentation goes over them. I think Claus intended the weights to
reward performance in certain areas and overlook minor deficiencies in
other areas.

> With the current algorithm, servers are rewarded if they have consistent
> performance across all tested areas. Bad performance in any one area can
> seriously effect the Xstones number.

This is not the case. The current algorithm rewards exceptional performance
in areas that are weighted less than others.
Take for instance,

Test case d:
    measured_value[0] = 50    weight[0] = 300    sun_value[0] = 100
    measured_value[1] = 5     weight[1] = 600    sun_value[1] = 10

    xstone by old algorithm = 15000
    xstone should be        = 10000

              __
             |  |
             |  |
             |  |     __
             |  |    |  |
    weight   |  |    |  |
     600     |  |    |  |
             ----    ----
    weight   |  |    |  |
     300     |  |    |  |
             |__|    |  |
                     |  |
                     |  |
                     |__|

> With the modified code, servers are rewarded if any significant areas perform
> very well, even if others perform absolutely abysmally. For a general benchmark
> I suspect the first behavior is better than the second.

If you really want the latter, then you will have to use a third algorithm
to compute xstones.

> It doesn't help that the benchmark system (Sun 3/50, R3, no fpu) runs arcs very
> slowly. This allows any decent server to get arcStones in the hundreds of
> thousands, if not the millions. Even though arcStones makes up a small
> percentage of the final Xstone, a small percentage of a HUGE number is still a
> large number. This seriously skews the Xstones values for such machines.

The point is that xstones are seriously skewed by either algorithm. The old
algorithm rewards elements of the test that have low weights. The new
algorithm rewards elements of the test that have high weights. But the
latter is what the documentation states is intended.

> These problems could be mitigated by using a better benchmark base. But I
> really feel that the effect of the current algorithm gives a better comparison
> base then using the algorithm described in the text, and implemented in your
> patch.

Again, the algorithm is biased to favor machines that perform better at
lesser-weighted elements of the test. This is contrary to what the
documentation states.

> Of course, none of this is to imply the xbench is really a great benchmark. Its
> biggest asset is that it comes up with one final number, which can be used as
> a quick "general estimation" of a server's speed.
> A server could have a quite
> low Xstones rating, and still be the best price/performance solution to a
> particular application. Likewise, a server with a high Xstones rating could be
> a real dog for some applications. But, for a quick reference, I believe the
> Xstones number is usable. It at least sets a scale as a basis for further
> benchmarking efforts.

For all you golfers out there: Xstones is about as useful as standing on
the tee box and throwing grass up in the air to tell how to play the hole.
I rather doubt that it sets any kind of scale for benchmarking X.

----
chuck adams
adamsc@orca.wv.tek.com
{decvax ucbvax hplabs}!tektronix!orca!adamsc
Interactive Technologies Division/Visual Systems Group
Tektronix, Inc.
P.O. Box 1000, M/S 61-049
Wilsonville, OR 97070
(503) 685-2589