Path: utzoo!mnetor!uunet!husc6!hao!ames!sdcsvax!ucsdhub!hp-sdd!hplabs!hpcea!hpnmd!hpsrla!brucek
From: brucek@hpsrla.HP.COM (Bruce Kleinman)
Newsgroups: comp.arch
Subject: Re: "Life" benchmarks
Message-ID: <3460007@hpsrla.HP.COM>
Date: 31 Dec 87 22:39:41 GMT
References: <438@pcrat.UUCP>
Organization: HP Network Measurements Div - Santa Rosa, CA
Lines: 74

+--------
| I wonder what current processors are capable of on the "life" benchmark?
+--------

BEFORE I POST ANY NUMBERS, I wish to make it clear that the performance
achieved with BKLife is mostly due to the algorithm, not the processor.
I assume that the Blitter and Cytocomputer implementations took a
brute-force approach (processing all cells in the universe, or perhaps a
bounded area of cells). Even fairly dense life patterns rarely exceed 10
percent density; that is, at most about one cell in ten is occupied at
any time. A brief analysis shows that a brute-force algorithm will spend
most of its time on these empty cells.

Enter the BKLife algorithm, a fairly efficient life routine which
achieves very high performance on very large universes. The routine
tracks all living cells, and processes only the appropriate regions of
the universe. Empty cell areas have no effect on performance.

AS FOR WHICH NUMBERS TO POST, the concept of "life pixels/second" isn't
valid for an adaptive algorithm, as I can simply place a blinker (or any
other very small pattern) in the middle of a huge virtual universe (say,
8192 by 1024), and produce numbers in the vicinity of 10 billion life
pixels/second. Makes an impressive marketing benchmark, huh? Anyway, it
has no real value, as only 15 of the 8M cells are being recalculated
each generation.

After long and careful consideration, my associates and I settled on the
puffer train as the 'standard' life pattern for our benchmarks. The
puffer produces an active tail as it progresses through the universe.
The tail does not stabilize until generation 5533, so the pattern is
fairly interesting to boot. Benchmarks run on workstations do not
include screen updates, as I/O has a nasty habit of slowing things
down ;-).

Benchmark number one is the first 1000 generations of the puffer train,
which must be run on at least a 600 by 600 universe to avoid boundary
problems. This configuration requires about 0.7MB with BKLife, which
makes it manageable on smallish machines.

Benchmark number two is the first 5600 generations of the puffer train,
which requires at least a 6000 by 600 universe. This requires about 6MB
with BKLife.

AT LAST THE NUMBERS, which are for a Hewlett Packard 9000 series 350
(25 MHz 68020 box) with 8 MBytes DRAM, running version 5.5 of HP-UX.

    Benchmark 1 ... 19.6 s = 50.9 puffer generations per second (pGPS)
    Benchmark 2 ... 1118 s =  5.0 puffer generations per second (pGPS)

As the numbers show, the puffer train grows rapidly, which slows down
the algorithm. Further, BKLife accesses memory with a rather odd sense
of locality, and at a certain point the d-cache becomes useless.

Although the user interface does not yet support it, BKLife was designed
to be used in two- or three-dimensional universes. The central algorithm
is flexible enough to handle an arbitrary number of dimensions with
absolutely no modifications. Additionally, the life rule (Conway's life
rule is 2233) can be specified, allowing more investigation.

INTERESTED PARTIES can contact me via e-mail for a copy of the code. It
was initially written entirely in C, using the curses library for I/O (a
version using X Windows is in the works). The critical paths were later
coded in 68020 assembly, but I still have C versions of these routines.
I also have a selection of interesting life patterns, including the
ever-popular glider gun and space rake. Furthermore, the package
includes both a user guide and a reasonable discussion of the code
itself (documentation - hard to believe).
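For anyone who wants a feel for the "track the living cells" idea before
asking for the code, here is a minimal sketch in C. It is NOT the BKLife
source and makes no claim to match it: the names (Cell, life_add,
life_step), the fixed 64 by 64 universe, the dead border, and the way the
rule is passed as four separate bounds are all assumptions made for
illustration. The point it shows is that each generation touches only
live cells and their immediate neighbors, so empty regions cost nothing.

```c
#include <assert.h>

/* Hypothetical sketch, not BKLife: every name and size here is assumed. */
#define W 64
#define H 64
#define MAXLIVE (W * H)

typedef struct { int x, y; } Cell;

static unsigned char alive[H][W];   /* 1 if the cell is occupied            */
static unsigned char count[H][W];   /* neighbor counts; all zero between steps */
static unsigned char queued[H][W];  /* 1 if already on the candidate list   */

/* Add a live cell to the universe; live[] must have room for MAXLIVE cells.
   Returns the new number of live cells. */
int life_add(Cell *live, int n, int x, int y)
{
    assert(x >= 0 && x < W && y >= 0 && y < H && n < MAXLIVE);
    if (alive[y][x]) return n;          /* already present */
    alive[y][x] = 1;
    live[n].x = x; live[n].y = y;
    return n + 1;
}

/* Advance one generation, visiting only live cells and their neighbors.
   A live cell survives with s_lo..s_hi neighbors; an empty cell is born
   with b_lo..b_hi neighbors (Conway: 2,3,3,3). Returns the new live count
   and rewrites live[] in place. */
int life_step(Cell *live, int n, int s_lo, int s_hi, int b_lo, int b_hi)
{
    static Cell cand[MAXLIVE];          /* cells that might change state */
    int ncand = 0, out = 0, i, dx, dy;

    /* Pass 1: every live cell bumps the counts of its 8 neighbors and
       queues itself and those neighbors as candidates (each at most once).
       Positions outside the universe are ignored (dead border). */
    for (i = 0; i < n; i++) {
        for (dy = -1; dy <= 1; dy++) {
            for (dx = -1; dx <= 1; dx++) {
                int x = live[i].x + dx, y = live[i].y + dy;
                if (x < 0 || x >= W || y < 0 || y >= H) continue;
                if (dx || dy) count[y][x]++;
                if (!queued[y][x]) {
                    queued[y][x] = 1;
                    cand[ncand].x = x; cand[ncand].y = y; ncand++;
                }
            }
        }
    }

    /* Pass 2: apply the rule to candidates only, rebuild the live list,
       and reset the scratch arrays for the next generation. */
    for (i = 0; i < ncand; i++) {
        int x = cand[i].x, y = cand[i].y, c = count[y][x];
        int next = alive[y][x] ? (c >= s_lo && c <= s_hi)
                               : (c >= b_lo && c <= b_hi);
        alive[y][x] = (unsigned char)next;
        count[y][x] = 0;
        queued[y][x] = 0;
        if (next) { live[out].x = x; live[out].y = y; out++; }
    }
    return out;
}
```

With a blinker as input, the candidate list in this sketch holds 15
cells (a 3 by 5 block around the three live cells), whatever the size of
the universe. A real implementation would presumably replace the fixed
arrays with something that scales to large universes; that choice is
left open here.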
Bruce Kleinman
Hewlett Packard - Network Measurements Division
brucek@hpnmd.hp.com
...ucbvax!hplabs!hpnmd!brucek