Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!voder!nsc!amdahl!amdcad!crackle!tim
From: tim@crackle.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Compiling - RISC vs. CISC
Message-ID: <26325@amdcad.AMD.COM>
Date: 12 Jul 89 15:20:23 GMT
References: <13976@lanl.gov> <25547@shemp.CS.UCLA.EDU> <26247@amdcad.AMD.COM> <25562@shemp.CS.UCLA.EDU> <26257@amdcad.AMD.COM> <151@ssp1.idca.tds.philips.nl>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc. Sunnyvale CA
Lines: 30

In article <151@ssp1.idca.tds.philips.nl> roelof@idca.tds.PHILIPS.nl (R. Vuurboom) writes:
| Anybody care to quantify this? Just what sort of performance improvement can I
| expect from no-holds-barred optimization over only-what-I-have-to optimization?

Here are a couple of data points.  The only optimizations performed by
the internal pcc-derived compiler were delayed-branch slot filling,
loop-rotation, leaf-procedure optimization (no frame allocated), and
some loop-invariant code motion (mainly constant addresses).  We can
compare this to the MetaWare High-C compiler for the Am29000, which
performs many more optimizations, including common-subexpression
elimination, dead-code elimination, constant and variable propagation,
register assignment by coloring, etc.:

Benchmark	VAX 11/780 pcc	29K pcc		29K MetaWare

diff		   3.2 s	0.208 s		0.157 s (+32%)
grep		   2.1 s	0.193 s		0.142 s (+35%)
nroff		   7.1 s	0.564 s		0.507 s (+11%)

(Percentages give the MetaWare speedup over 29K pcc.  The 29K simulation
model was 25 MHz, with separate 8-Kbyte caches, which were 2-cycle first
access, single-cycle burst.)

So you can see that there is a definite improvement, but it certainly
isn't the 3X - 5X implied by the assertion that "you have to use
highly-optimizing compilers with RISC, otherwise you might as well use a
CISC processor."
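
For anyone who wants to check the arithmetic, here is a small sketch that
recomputes the ratios from the table.  The names (times, speedup,
percent_gain) are made up for illustration, and truncating the percentage
rather than rounding it is an assumption, chosen only because it reproduces
the figures quoted in the table.

```python
# Run times in seconds, copied from the table in this post.
times = {
    "diff":  {"vax_pcc": 3.2, "pcc_29k": 0.208, "metaware_29k": 0.157},
    "grep":  {"vax_pcc": 2.1, "pcc_29k": 0.193, "metaware_29k": 0.142},
    "nroff": {"vax_pcc": 7.1, "pcc_29k": 0.564, "metaware_29k": 0.507},
}


def speedup(slow, fast):
    """Return how many times faster the 'fast' run is than the 'slow' run."""
    return slow / fast


def percent_gain(slow, fast):
    """Speedup expressed as a percentage gain, truncated to a whole number.

    Truncation (not rounding) is an assumption; it happens to match the
    percentages shown in the table.
    """
    return int((speedup(slow, fast) - 1.0) * 100)


if __name__ == "__main__":
    for name, t in times.items():
        gain = percent_gain(t["pcc_29k"], t["metaware_29k"])
        ratio = speedup(t["pcc_29k"], t["metaware_29k"])
        vax_ratio = speedup(t["vax_pcc"], t["pcc_29k"])
        print(f"{name:6s} MetaWare vs 29K pcc: {ratio:.2f}X (+{gain}%), "
              f"29K pcc vs VAX pcc: {vax_ratio:.1f}X")
```

The largest MetaWare-over-pcc ratio this gives is about 1.36X (grep), which
is the basis for saying the gain is nowhere near 3X.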

	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)