Path: utzoo!attcan!uunet!husc6!bbn!rochester!pt.cs.cmu.edu!k.gp.cs.cmu.edu!lindsay
From: lindsay@k.gp.cs.cmu.edu (Donald Lindsay)
Newsgroups: comp.arch
Subject: Re: Benchmarking
Summary: Program generators have amazing leverage
Message-ID: <3285@pt.cs.cmu.edu>
Date: 12 Oct 88 03:25:39 GMT
References: <2220003@hpausla.HP.COM> <46500026@uxe.cso.uiuc.edu> <6683@nsc.nsc.com> <6684@nsc.nsc.com> <4263@wright.mips.COM> <6729@nsc.nsc.com> <10498@reed.UUCP> <4655@winchester.mips.COM> <6868@nsc.nsc.com> <1988Oct9.011633.13259@utzoo.uucp> <4853@winchester.mips.C
Sender: netnews@pt.cs.cmu.edu
Organization: Carnegie-Mellon University, CS/RI
Lines: 55

In article <6899@nsc.nsc.com> grenley@nsc.nsc.com.UUCP (George Grenley) writes:
>I've received a lot of email on b'marking; one individual pointed out that
>the database community "scales" the size of the b'mark (i.e., size of dbase)
>to the size of machine. An interesting idea. I think we should consider
>taking some of the small integer b'marks, and "enlarge" them by having the
>program call itself recursively in a non-trivial way. Then, the test would
>consist of running the program at, say, 1 through 1000 levels of recursion,
>or whenever you run out of RAM. Then, publish the performance numbers.
>Comments? I am willing to volunteer to drive this if anyone (like, f'rinstance,
>someone who can code better than me) wants to help.

First, I am solidly behind the idea that the best benchmark is the
user's application. That said, synthetic benchmarks might as well be
as good as they can be. So, some guidelines:
- the code working set must be adjustable, without upper bound.
- the data working set, likewise.
- the compiler must be prevented from inlining.
- the compiler must be prevented from eliminating dead code.
- the benchmark must be small, so that it can be presented in full in
  reports. (This avoids the "slight change" problem, as well as
  permitting easy shipment.)

There is a fairly simple way to achieve these ends.
Do not write a benchmark program: write a program which writes out the
benchmark program. A simple loop in the Generator program allows the
creation of arbitrarily large source files. (Since compilers can get
bent by this, the Generator should also generate multiple source
files.) The procedure names will be somewhat unimaginative: f0001,
f0002, and so on. If the source files are in C, then it's fair to
generate macros and macro calls, simply to reduce the file space
requirements.

Next, the Generator should write the code to fill an array with
pointers to these functions. Similarly, we need a data array.

Next, we need a portable routine which generates pseudo-random
numbers. (Portable mostly means that it avoids arithmetic overflow.)
The quality of the randomness is unimportant, as long as it doesn't
get stuck at 0 or other such silliness.

The generated program will use the randoms to form subscripts, either
into the data array, or into the function pointer array. In this way,
we may control the size of the working sets. Since the functions
should (largely) be accessed via the array, inlining is defeated.
Avoid dead code.

I have no comment concerning the contents of the routines: the
Generator is independent of this, and should be able to generate
several benchmarks (for instance, an integer one, and a float one).

Since the benchmarks must be told how "big" to be, the benchmark
report form should be written as part of the benchmark. This must
specify how many runs must be made, and with exactly what parameters.
-- 
Don lindsay@k.gp.cs.cmu.edu CMU Computer Science
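
The Generator scheme described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished tool. The names emit_routines, emit_driver, DEF, table, data, NFUNCS, NDATA and next_random are all hypothetical. The routine bodies are trivial placeholders, since the post deliberately leaves their contents open. The random number routine is the Park-Miller multiplicative generator with Schrage's decomposition, picked here as one known way to stay inside 32-bit signed arithmetic; the post does not name a particular generator.

```c
#include <assert.h>
#include <stdio.h>

/* Emit one source file holding routines f<first> .. f<first+count-1>.
 * A macro keeps the generated file small. Bodies are placeholders. */
int emit_routines(FILE *out, int first, int count)
{
    int i;
    assert(first >= 1 && count >= 1 && first + count - 1 <= 9999);
    fputs("extern unsigned long data[];\n", out);
    fputs("#define DEF(name, k) unsigned long name(long i) "
          "{ return data[i] += (k); }\n", out);
    for (i = first; i < first + count; i++)
        fprintf(out, "DEF(f%04d, %du)\n", i, i);
    return ferror(out) ? -1 : 0;
}

/* Fixed tail of the generated driver: the random routine and main(). */
static const char driver_tail[] =
    "static long seed = 1;\n"
    "/* seed = 16807 * seed mod (2^31 - 1), computed without overflow */\n"
    "static long next_random(void)\n"
    "{\n"
    "    long t = 16807 * (seed % 127773) - 2836 * (seed / 127773);\n"
    "    seed = t > 0 ? t : t + 2147483647;\n"
    "    return seed;\n"
    "}\n"
    "int main(int argc, char **argv)\n"
    "{\n"
    "    long code_ws = argc > 1 ? atol(argv[1]) : NFUNCS;\n"
    "    long data_ws = argc > 2 ? atol(argv[2]) : NDATA;\n"
    "    long iters = argc > 3 ? atol(argv[3]) : 1000000L;\n"
    "    unsigned long sum = 0;\n"
    "    long n;\n"
    "    if (code_ws < 1 || code_ws > NFUNCS || data_ws < 1 || data_ws > NDATA)\n"
    "        return 1;\n"
    "    for (n = 0; n < iters; n++) {\n"
    "        long f = next_random() % code_ws;   /* code subscript */\n"
    "        sum += table[f](next_random() % data_ws);\n"
    "    }\n"
    "    printf(\"%lu\\n\", sum);   /* every result feeds the output */\n"
    "    return 0;\n"
    "}\n";

/* Emit the driver file: data array, function pointer table, then the tail. */
int emit_driver(FILE *out, int nfuncs, long ndata)
{
    int i;
    assert(nfuncs >= 1 && nfuncs <= 9999 && ndata >= 1);
    fputs("#include <stdio.h>\n#include <stdlib.h>\n", out);
    fprintf(out, "#define NFUNCS %dL\n#define NDATA %ldL\n", nfuncs, ndata);
    fputs("unsigned long data[NDATA];\n", out);
    for (i = 1; i <= nfuncs; i++)
        fprintf(out, "unsigned long f%04d(long);\n", i);
    fputs("static unsigned long (*table[])(long) = {\n", out);
    for (i = 1; i <= nfuncs; i++)
        fprintf(out, "    f%04d,\n", i);
    fputs("};\n", out);
    fputs(driver_tail, out);
    return ferror(out) ? -1 : 0;
}
```

A Generator main would call emit_routines once per output file with disjoint ranges (say 1..500 into one file and 501..1000 into another), then emit_driver(out, 1000, 65536L) into a third, and the files would be compiled and linked together. The generated program takes the code working set (number of routines), the data working set (number of array elements) and the iteration count as its three arguments, so the report form can state exactly which values to run. Unsigned arithmetic is used for the data so that the sums wrap rather than overflow.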