Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!apple!voder!nsc!amdahl!amdcad!crackle!tim
From: tim@crackle.amd.com (Tim Olson)
Newsgroups: comp.arch
Subject: Re: Compiling - RISC vs. CISC
Message-ID: <26325@amdcad.AMD.COM>
Date: 12 Jul 89 15:20:23 GMT
References: <13976@lanl.gov> <25547@shemp.CS.UCLA.EDU> <26247@amdcad.AMD.COM> <25562@shemp.CS.UCLA.EDU> <26257@amdcad.AMD.COM> <151@ssp1.idca.tds.philips.nl>
Sender: news@amdcad.AMD.COM
Reply-To: tim@amd.com (Tim Olson)
Organization: Advanced Micro Devices, Inc. Sunnyvale CA
Lines: 30
Summary:
Expires:
Sender:
Followup-To:

In article <151@ssp1.idca.tds.philips.nl> roelof@idca.tds.PHILIPS.nl (R. Vuurboom) writes:
| Anybody care to quantify this? Just what sort of performance improvement can I
| expect from no-holds-barred optimization over only-what-I-have-to optimization.?

Here are a couple of data points. The only optimizations performed by
the internal pcc-derived compiler were delayed-branch slot filling,
loop rotation, leaf-procedure optimization (no frame allocated), and
some loop-invariant code motion (mainly constant addresses). We can
compare this to the MetaWare High-C compiler for the Am29000, which
performs many more optimizations, including common-subexpression
elimination, dead-code elimination, constant and variable propagation,
register assignment by coloring, etc.:

Benchmark    VAX 11/780 pcc    29K pcc    29K MetaWare
diff         3.2 s             0.208 s    0.157 s (+32%)
grep         2.1 s             0.193 s    0.142 s (+35%)
nroff        7.1 s             0.564 s    0.507 s (+11%)

(29K simulation model was 25MHz, with separate 8Kbyte caches, which
were 2-cycle first access, single-cycle burst.)

So you can see that there is a definite improvement, but it certainly
isn't the 3X - 5X implied by the assertion that "you have to use
highly-optimizing compilers with RISC, otherwise you might as well use a
CISC processor."

-- 
Tim Olson
Advanced Micro Devices
(tim@amd.com)
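The percentages quoted in the table can be rechecked from the raw timings. The sketch below is not from the original post; it assumes each percentage is the 29K pcc time divided by the 29K MetaWare time, minus one, which is one reading that reproduces the quoted figures when the result is truncated to a whole percent. The dictionary keys and function names are made up for illustration.

```python
# Timings in seconds, copied from the table in the post.
timings = {
    "diff":  {"vax_pcc": 3.2, "pcc_29k": 0.208, "metaware_29k": 0.157},
    "grep":  {"vax_pcc": 2.1, "pcc_29k": 0.193, "metaware_29k": 0.142},
    "nroff": {"vax_pcc": 7.1, "pcc_29k": 0.564, "metaware_29k": 0.507},
}


def speedup(slow, fast):
    """Ratio of the slower run time to the faster run time."""
    return slow / fast


def improvement_percent(slow, fast):
    """Assumed formula: (slow / fast - 1) expressed as a percentage."""
    return (speedup(slow, fast) - 1.0) * 100.0


# Improvement of the MetaWare build over the pcc build, per benchmark.
improvements = {
    name: improvement_percent(t["pcc_29k"], t["metaware_29k"])
    for name, t in timings.items()
}

# Largest pcc-to-MetaWare speedup across the three benchmarks.
max_speedup = max(
    speedup(t["pcc_29k"], t["metaware_29k"]) for t in timings.values()
)

for name, pct in improvements.items():
    print(f"{name:6s} MetaWare over pcc: +{pct:.1f}%")
print(f"largest speedup: {max_speedup:.2f}x")
```

Under that assumed formula, the largest pcc-to-MetaWare speedup across the three benchmarks stays well under 2X, which is the point the post makes against the 3X - 5X claim.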