Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!rutgers!labrea!decwrl!pyramid!prls!mips!mash
From: mash@mips.UUCP (John Mashey)
Newsgroups: comp.arch
Subject: Re: What with these Vector's anyways?
Message-ID: <558@winchester.UUCP>
Date: Sat, 1-Aug-87 19:06:49 EDT
Article-I.D.: winchest.558
Posted: Sat Aug  1 19:06:49 1987
Date-Received: Tue, 18-Aug-87 02:32:32 EDT
References: <218@astra.necisa.oz> <142700010@tiger.UUCP>
Reply-To: mash@winchester.UUCP (John Mashey)
Distribution: world
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 59
Keywords: scalar vs. vectors, benchmarks, Dhrystone, sorting

In article <2425@ames.arpa> lamaster@ames.UUCP (Hugh LaMaster) writes:
...long, reasonable discussion on vector machines, benchmarking

....good discussion of where Dhrystone should be changed for different
environments.  100% agree, except for the following:
>Weicker's PROGRAM has been widely
>criticized, but the STATISTICS behind it are probably valid for records and
>pointers type code....

Actually, some of the pointer behavior doesn't seem quite typical, at least in
the C version.  For example, about 50% of the loads/stores (on MIPS machines,
anyway) use 0-offsets, whereas the more typical percentages are 10-15% in
user-level C programs.
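
For anyone unsure what a "0-offset" load is, here is a minimal sketch in C.
The record type and function names are made up for illustration; they are not
taken from Dhrystone.  Fetching the first member of a record through a pointer
needs no displacement from the base register, while fetching any later member
does:

```c
#include <stddef.h>

/* Hypothetical record type, for illustration only (not Dhrystone's). */
struct rec {
    struct rec *next;   /* first member: always at offset 0 */
    int         kind;   /* later members sit at nonzero offsets */
    int         value;
};

/* Load of the first member: address is base + 0 (a "0-offset" load). */
struct rec *follow(const struct rec *p) { return p->next; }

/* Load of a later member: address is base + offsetof(struct rec, value). */
int get_value(const struct rec *p) { return p->value; }

/* Helpers that report the displacements the two loads above would use. */
size_t next_offset(void)  { return offsetof(struct rec, next); }
size_t value_offset(void) { return offsetof(struct rec, value); }
```

A program that mostly chases `next`-style pointers, or dereferences plain
`*p`, will show a high 0-offset fraction; one that mostly picks fields out of
the middle of records will not.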

>It should be noted that one thing that Dhrystone does do "right" is make lots
>of procedure calls....

In general, I'd agree.  However, Dhrystone somewhat overemphasizes the
importance of a fast function call. On our systems, the numbers of
instructions/call for Dhrystone are:
35	-O3 [global + inter-procedural register allocation]
36	-O2 [global opt]  [call this typical]
41	-O1 [no global opt]

Here are a few other numbers for integer user-level programs:
 54	nroff
 56	ccom
 57	uopt [global optimizer]
 69	tex
 85	as1 [1st pass of assembler]
350	espresso

and some for a few floating point programs, or at least with some FP:
 38	whetstone single
 48	whetstone single
140	hspice
370	timberwolf
735	DP linpack, FORTRAN

Of course, these are INSTRUCTION COUNTS, not including cache/tlb degradation,
or multi-cycle instruction effects, but it certainly gives a gross idea
of what's going on.   Basically, Dhrystone does function calls 1.5X to 2X
more frequently than large user-level C programs.  Needless to say, this
effect makes VAXen look especially bad, relative to how they actually
perform on other programs.  [I'm not defending slow function calls, of course!]
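
The arithmetic behind the 1.5X to 2X figure can be sketched as follows, taking
the -O2 Dhrystone number (36) as the baseline.  The function and table names
are just for illustration; the numbers are the ones listed above.

```c
#include <stdio.h>
#include <stddef.h>

#define DHRYSTONE_O2 36   /* instructions/call for Dhrystone at -O2 */

/* Instructions per call for the integer programs listed in this article. */
static const struct { const char *name; int instr_per_call; } progs[] = {
    { "nroff", 54 }, { "ccom", 56 }, { "uopt", 57 },
    { "tex", 69 },   { "as1", 85 },  { "espresso", 350 },
};

/* How many times more often Dhrystone makes a call than a program that
   executes instr_per_call instructions between calls. */
double call_frequency_ratio(int instr_per_call)
{
    return (double)instr_per_call / DHRYSTONE_O2;
}

/* Print one line per program: name, instructions/call, ratio. */
void print_ratios(void)
{
    size_t i;
    for (i = 0; i < sizeof progs / sizeof progs[0]; i++)
        printf("%-10s %4d  %.2fX\n", progs[i].name,
               progs[i].instr_per_call,
               call_frequency_ratio(progs[i].instr_per_call));
}
```

nroff, ccom, uopt and tex come out between 1.5X and about 1.9X; as1 and
espresso make calls even less often than that.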

Actually, an amusing test was running the Pascal version on an 8600 with the
Pastel compiler, which avoids the VAX CALL instructions in favor of
compiler-constructed sequences.  This is NOT exactly comparable, since
Dhrystone's character-string performance is much better in the Pascal
version [fixed-length strings, rather than null-terminated], and since Pastel
optimizes better than Ultrix 1.2's cc.  Still, there was a 2X performance
increase.  I'd guess that Dhrystone understates VAX performance (relative to
architectures with leaner calls) by about 15-25%, given similar compilers.
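
To illustrate the fixed-length vs. null-terminated point, here is a rough
sketch in C.  It is not the Dhrystone source and says nothing about what
Pastel actually generates; STR_LEN and the function names are assumptions
made up for the example.  With a length known at compile time, the compiler
can compare a fixed number of bytes; with C strings, every byte must also be
checked for the terminator.

```c
#include <string.h>

#define STR_LEN 30   /* assumed fixed string length, for illustration only */

/* Pascal-style: both operands are exactly STR_LEN bytes, no terminator.
   The byte count is a compile-time constant. */
int equal_fixed(const char *a, const char *b)
{
    return memcmp(a, b, STR_LEN) == 0;
}

/* C-style: the length is unknown, so the comparison has to watch for
   the terminating NUL as it goes. */
int equal_nul(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```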
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086