Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!nysernic!itsgw!batcomputer!pyramid!prls!mips!earl
From: earl@mips.UUCP (Earl Killian)
Newsgroups: comp.arch
Subject: Re: register windows
Message-ID: <837@gumby.UUCP>
Date: Mon, 26-Oct-87 02:37:41 EST
Article-I.D.: gumby.837
Posted: Mon Oct 26 02:37:41 1987
Date-Received: Wed, 28-Oct-87 01:31:54 EST
References: <201@PT.CS.CMU.EDU> <933@cpocd2.UUCP> <821@mips.UUCP> <18855@amdcad.AMD.COM>
Lines: 54
Keywords: register windows, interrupt latency
Summary: more data

In article <18855@amdcad.AMD.COM>, tim@amdcad.AMD.COM (Tim Olson) writes:
> In article <833@mips.UUCP> hansen@mips.UUCP (Craig Hansen) writes:
> | It should also be noted, again, that the selection of programs used as
> | benchmarks will influence the results too.  Making your architectural
> | trade-offs on the basis of Dhrystone, or even nroff, which is not very
> | representative of more modern C code, isn't very smart.
>
> Making architectural decisions based on *any* single program isn't very
> smart.  You should examine a large body of code, looking at older,
> heavily-used programs as well as more "modern" code (output of C++
> compilers, object-oriented programming).

It sounds like everyone's in agreement, and yet, so far the discussion
has talked about one program!  What's interesting about programs is
that some statistics are fairly consistent and some vary all over the
place.

To show the variance of the statistics relevant to this discussion,
consider a wide range of programs (statistics for the MIPSco
architecture):

                                                            non-sp/gp     non-sp/gp
                               sp-based       reg  gp-based  0-offset  non-0 offset
              loads    stores     ld/st     ld/st     ld/st     ld/st         ld/st
              -----     -----     -----     -----     -----     -----         -----
espresso      19.6%      1.1%      0.1%      0.1%      1.3%     18.7%          0.4%
spice         26.9%     16.3%      7.2%      2.8%      4.2%      1.5%         27.5%
wolf          25.3%      8.2%      7.1%      1.9%      3.6%      8.0%         12.9%
yacc          15.7%      2.1%      0.9%      0.5%      2.5%     12.4%          1.5%
diff          16.2%      3.2%      0.5%      0.7%      4.6%      7.2%          6.4%
compress      18.3%     10.6%      0.1%      3.5%      8.1%      7.8%          9.4%
uopt          21.8%      8.4%      5.6%      5.2%      1.2%      6.8%         11.4%
as1           18.3%     11.2%      4.4%      6.8%      3.7%      3.8%         10.8%
nroff         18.8%      8.6%      0.4%      7.7%     14.5%      3.1%          1.7%
tex           21.9%     13.8%      3.6%      9.2%     10.8%      5.1%          7.0%
ccom          18.7%     12.2%      3.4%     11.9%      3.8%      5.0%          6.9%
doduc         29.4%     10.2%     10.1%      4.1%     12.3%      1.5%         11.6%

The sum of the last 5 columns should equal the sum of the first 2, to
within rounding.  In other words, the last 5 are a partition of the
loads and stores into sp-based, register saves and restores at
procedure entry/exit, gp-based (that is, small static variables
addressed by a dedicated register, which always have a nonzero offset
on the MIPSco machine), other load/stores with zero offset, and other
load/stores with nonzero offset.  Everything is a percentage of the
total instructions executed.

My conclusion: be careful of drawing conclusions from a small number
of data points.  For example, when deciding whether to have an offset
for load/store, looking at just espresso (0.4% nonzero) or just spice
(27.5% nonzero) would lead to very different results.

Caveat: these statistics are from one architecture / compiler.  Your
actual mileage may vary.  In particular, I would bet the 29k compilers
try hard to avoid needing offsets.  To the extent that these
techniques succeed, they would reduce the roughly 27.5% slowdown you'd
expect in spice if every nonzero-offset access needed an extra
instruction.  However, I think these statistics do give some
indication of the desirability of having a load/store offset.
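
For anyone who wants to check the table mechanically, here is a minimal
Python sketch.  The column labels and function names are invented for
this illustration and are not part of any MIPS tool; the numbers are
copied straight from the table.  It checks the partition claim (last 5
columns versus first 2) and prints the min/max of each column, which is
one simple way to see which statistics are consistent and which vary
all over the place.

```python
# Values are percentages of all instructions executed, copied from the table.
# Column labels below are hypothetical names chosen for this sketch.
COLUMNS = ("loads", "stores", "sp", "reg_save", "gp", "zero_off", "nonzero_off")

ROWS = {
    "espresso": (19.6, 1.1, 0.1, 0.1, 1.3, 18.7, 0.4),
    "spice":    (26.9, 16.3, 7.2, 2.8, 4.2, 1.5, 27.5),
    "wolf":     (25.3, 8.2, 7.1, 1.9, 3.6, 8.0, 12.9),
    "yacc":     (15.7, 2.1, 0.9, 0.5, 2.5, 12.4, 1.5),
    "diff":     (16.2, 3.2, 0.5, 0.7, 4.6, 7.2, 6.4),
    "compress": (18.3, 10.6, 0.1, 3.5, 8.1, 7.8, 9.4),
    "uopt":     (21.8, 8.4, 5.6, 5.2, 1.2, 6.8, 11.4),
    "as1":      (18.3, 11.2, 4.4, 6.8, 3.7, 3.8, 10.8),
    "nroff":    (18.8, 8.6, 0.4, 7.7, 14.5, 3.1, 1.7),
    "tex":      (21.9, 13.8, 3.6, 9.2, 10.8, 5.1, 7.0),
    "ccom":     (18.7, 12.2, 3.4, 11.9, 3.8, 5.0, 6.9),
    "doduc":    (29.4, 10.2, 10.1, 4.1, 12.3, 1.5, 11.6),
}


def partition_gap(row):
    """Absolute difference between loads+stores and the five partition columns."""
    return round(abs(sum(row[:2]) - sum(row[2:])), 1)


def column_range(name):
    """Return (min, max) of one column across all programs."""
    i = COLUMNS.index(name)
    values = [row[i] for row in ROWS.values()]
    return min(values), max(values)


if __name__ == "__main__":
    for prog, row in ROWS.items():
        print(f"{prog:<9} partition gap = {partition_gap(row):.1f}")
    for name in COLUMNS:
        lo, hi = column_range(name)
        print(f"{name:<12} min {lo:5.1f}%  max {hi:5.1f}%")
```

With the numbers as posted, the gap is 0.1 for espresso and ccom and
zero for the other ten programs, which is consistent with rounding to
one decimal place.  The nonzero-offset column runs from 0.4% to 27.5%,
while loads only run from 15.7% to 29.4%.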