Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!munnari.oz.au!labtam!graeme
From: graeme@labtam.labtam.oz (Graeme Gill)
Newsgroups: comp.benchmarks
Subject: Re: X-terminal benchmarks
Message-ID: <5572@labtam.labtam.oz>
Date: 14 Nov 90 03:16:52 GMT
References: <90315.122715HANK@BARILVM.BITNET> <43091@mips.mips.COM>
Organization: Labtam Information Systems Pty. Ltd., Melbourne, Australia
Lines: 84

In article <43091@mips.mips.COM>, rnovak@mips.com (Robert E. Novak) writes:
> Are the sources for the benchmarks available?
>

From the look of the tests, xbench was probably used to produce these figures. The other widely known test suite is x11perf.

xbench as it was distributed over the network has some problems. One major one is that it neither disables nor accepts the no-expose events generated by its copy area tests. This can cause strange results as the host starts running out of memory trying to store all the events. There are other peculiarities as well: for example, the invert rectangles test only does one rectangle at a time, so it is partially a measure of network latency rather than pure invert area performance. This is inconsistent with the other tests.

x11perf is most useful for developmental work on servers, although it is possible to use its results to draw conclusions about the relative performance of an X server. I use x11perf extensively to verify the results of my server optimisations. This often involves modifying and/or adding tests to the x11perf suite.

> One of the base requirements for all SPEC benchmarks (as you may have
> guessed) is that the results of a program can be mechanically verified
> against known good results.

This could be a difficult issue. Although the X specifications specify exact pixelisation rules for most graphical operations, some are deliberately relaxed - e.g. zero width lines - so that machine-dependent hardware can be used.
Since zero width lines are widely used by X applications, one cannot simply leave out these tests. The MIT X11R3 example server (which a number of vendors' products are still based on) does not even meet the pixelisation rules for some operations.

Verification would be almost impossible to do at the same time as speed testing, since efficient use of the X protocol calls for doing as many graphics operations as possible per packet, and reading back an image is a relatively slow operation. If you really wanted to 'cook' the results, a server could save all the commands and only render them on receiving a getimage request. The performance of all operations except getimage would then seem exceptionally fast.

Verification of the graphics rendering is only part of the problem, as other commands would also have to be verified - e.g. window creation, cursor operation, exposures etc.

It would certainly be very useful to have an X pixelisation verification suite, but this seems to be a difficult project, as the closest thing available from the MIT consortium is a partially completed X protocol verification suite. If such a tool were available then one could use it to verify the correct functioning of the device under test, and then run the performance benchmarks, but whether this is what you are looking for, I don't know.

The other slight possibility would be to come up with a series of operations that leave you with a (hopefully) unique pattern that can then be verified.

The current benchmarking tools only cover a small fraction of the spectrum of drawing operations that may differ markedly in speed on a particular X server. For instance, X allows the 16 boolean logical operations, but generally only fill and invert are tested by benchmarks. Many servers will special-case these two ops as they are the ones used by applications the vast majority of the time.
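
To make the point about the 16 logical operations concrete, here is a rough sketch in Python (not taken from any real server). The function codes and names are the GC function values from the X protocol, where each 4-bit code doubles as its own truth table over (src, dst). The rest is assumption for illustration: the 32-bit word size, the `rop_dispatch` routine, and the idea that "fill" maps to GXcopy and "invert" to GXinvert with a dedicated fast path for each.

```python
# GC function codes from the X protocol; index in the list == code value.
GX_NAMES = [
    "GXclear", "GXand", "GXandReverse", "GXcopy",
    "GXandInverted", "GXnoop", "GXxor", "GXor",
    "GXnor", "GXequiv", "GXinvert", "GXorReverse",
    "GXcopyInverted", "GXorInverted", "GXnand", "GXset",
]
GX = {name: code for code, name in enumerate(GX_NAMES)}

MASK = 0xFFFFFFFF  # assumption: one 32-bit framebuffer word


def rop_generic(code, src, dst):
    """Apply any of the 16 functions to one word using the code as a truth table."""
    ns, nd = ~src & MASK, ~dst & MASK
    out = 0
    if code & 1:
        out |= src & dst   # src=1, dst=1
    if code & 2:
        out |= src & nd    # src=1, dst=0
    if code & 4:
        out |= ns & dst    # src=0, dst=1
    if code & 8:
        out |= ns & nd     # src=0, dst=0
    return out


def rop_dispatch(code, src, dst):
    """Hypothetical server dispatch: special-case the two common ops only."""
    if code == GX["GXcopy"]:
        return src & MASK, "fast"
    if code == GX["GXinvert"]:
        return ~dst & MASK, "fast"
    return rop_generic(code, src, dst), "generic"


def untested_generic_ops(tested):
    """Names of ops that take the generic path and are not in the tested set."""
    return [n for n in GX_NAMES
            if n not in tested and rop_dispatch(GX[n], 0, 0)[1] == "generic"]


if __name__ == "__main__":
    src, dst = 0x0F0F0F0F, 0x00FF00FF
    for name in GX_NAMES:
        value, path = rop_dispatch(GX[name], src, dst)
        print(f"{name:15s} {value:08X} {path}")
```

In this model a benchmark that exercises only GXcopy and GXinvert never touches the generic path, so `untested_generic_ops({"GXcopy", "GXinvert"})` lists the 14 functions whose speed it says nothing about.
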
The speed of textured fills will vary markedly with the size of the texture pattern used, since servers may have 2 or 3 different algorithms depending on whether a line of the pattern will fit in a register or whether it is even, and therefore doesn't need bit shifting. Performance will vary widely depending on the number of operations that can be grouped together (i.e. the size of the poly fill rect request etc.), the frequency of sync commands etc.

Benchmarking of X terminals is especially difficult since many of the results will depend on the speed and exact implementation of the host machine's communication interface - i.e. Ethernet, TCP/IP etc. - and how that interacts with the server communications.

X servers can have notoriously uneven performance, so that two applications that make different demands on the X server may vary markedly in relative speed when running on different servers.

In summary, trying to benchmark X servers in a fair way may make CPU benchmarking look very simple. Considerable investigation of the issues that may affect performance is needed. If perfect verification of operation is needed, then benchmarking may not be possible at all.

Graeme Gill
Labtam I.S. Pty. Ltd.
graeme@labtam.oz.au