Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!lll-lcc!pyramid!prls!mips!mash
From: mash@mips.UUCP (John Mashey)
Newsgroups: comp.arch,comp.sys.nsc.32k
Subject: Re: Performance of the 532
Message-ID: <374@winchester.UUCP>
Date: Thu, 7-May-87 15:20:35 EDT
Article-I.D.: winchest.374
Posted: Thu May 7 15:20:35 1987
Date-Received: Sat, 9-May-87 09:39:22 EDT
References: <324@dumbo.UUCP> <809@killer.UUCP> <2417@homxa.UUCP> <4294@nsc.nsc.com>
Reply-To: mash@winchester.UUCP (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 56
Xref: mnetor comp.arch:1212 comp.sys.nsc.32k:136

In article <4294@nsc.nsc.com> grenley@nsc.UUCP (George Grenley) writes:
>
>So, here are some simulated facts: Our design team has done simulations
>of the chip's performance, both with ideal 0 wait state memory and with
>"real world" typical VME bus memory. We ran some unix utilities, including
>our own compilers, etc. I will divulge a few of these numbers now, and more
>later. (If I don't get burned for this):
>Grep ran at 8.4 mips from 0 ws memory, 7.9 from VME. Grep was one of the
>best. One of the worst was our assembler, it hit 5 mips from 0 ws, and
>4.5 mips from VME. On the average (these two plus several other CPU
>intensive programs) the '532 hit 6.1 mips from 0 ws, 5.3 mips from VME.

Could you say a little more on the configurations:
	cache size, nature [write-back or write-thru]
	if write-thru, did you use write buffers, and if so, how deep.
	exactly what the assumptions were on the VME memories
It would also be interesting [although I realize this might be sensitive info]
to get more info on the simulations, to be able to make a read on the
accuracy of the simulations:
	instruction cycles
	TLB-miss cycles
	cache-miss cycles
	[if present] write-buffer stall & write/read interlock cycles

>So, here's the deal.
>I invite Mot, Intel, and other interested parties
>to work with me in defining some sort of realistic benchmark, which we'll
>run (in public). I expect to have system level hardware late this year,
>so if we get started now, we'll have very interesting Xmas presents...

I think that's a great idea and am delighted that somebody has suggested it.
Presumably there will be 68030s benchmarkable in hardware by then, and
certainly 386s, Clippers, and WE32200s. As a first suggestion, I'd observe
that there are at least the following classes of realistic benchmarks:

1) Large FORTRAN / C floating-point ones [and there are many of these that
   are widely available]. One probably needs at least 5-10 of these to cover
   the different sorts of things that people do.

2) Large integer benchmarks: this is the real tough category: most of the
   larger, realistic ones tend to be proprietary codes, or else things where
   the code [like for assemblers, compilers, etc] inherently differs among
   systems. This also needs 5-10 of them, and could at least include a few
   of the larger UNIX utilities, although most of them fit into
   reasonable-sized caches, and hence don't stress things the way larger
   applications do.

3) Multi-user and/or systems benchmarks, using UNIX. Run shell scripts, etc.
   I'd think there should at least be a few of these.

One might want to focus on 1&2, if only to avoid the arguments on 3 regarding
different peripheral choices, operating system tuning, etc, unless the
shootout is intended as an OS shootout also.
-- 
-john mashey	DISCLAIMER:
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  	408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
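
A back-of-envelope check on the quoted simulation figures, as a sketch only: it takes the three pairs of mips numbers from the quoted article at face value and works out how much throughput each loses going from 0 ws memory to VME memory. The names quoted_mips, vme_loss_percent and loss_table are made up for this illustration and come from neither article.

```python
# Simulated '532 throughput as quoted in the article being replied to, in mips:
# (0-wait-state memory, VME bus memory). No other figures are assumed.
quoted_mips = {
    "grep": (8.4, 7.9),
    "assembler": (5.0, 4.5),
    "average": (6.1, 5.3),
}


def vme_loss_percent(zero_ws, vme):
    """Percentage of 0-wait-state throughput lost when running from VME memory."""
    return 100.0 * (zero_ws - vme) / zero_ws


def loss_table(figures):
    """Map each program name to its VME loss, rounded to one decimal place."""
    return {name: round(vme_loss_percent(z, v), 1) for name, (z, v) in figures.items()}


if __name__ == "__main__":
    for name, loss in loss_table(quoted_mips).items():
        print(f"{name:10s} loses {loss:4.1f}% going from 0 ws to VME")
```

If the quoted average is a simple arithmetic mean of the per-program mips, its drop of 0.8 mips is larger than the 0.5 mips drop of either program named, so some of the unnamed programs presumably suffer more from VME memory. That is one reason the cache, write-buffer and memory assumptions asked about above matter.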