Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!sri-unix!sri-spam!ames!oliveb!pyramid!prls!mips!mash
From: mash@mips.UUCP (John Mashey)
Newsgroups: comp.arch
Subject: Re: Japanese 32-bit CPUs ( NEC V70
Message-ID: <411@winchester.UUCP>
Date: Thu, 21-May-87 05:39:14 EDT
Article-I.D.: winchest.411
Posted: Thu May 21 05:39:14 1987
Date-Received: Sat, 23-May-87 11:25:33 EDT
References: <372@winchester.UUCP> <28200037@ccvaxa>
Reply-To: mash@winchester.UUCP (John Mashey)
Organization: MIPS Computer Systems, Sunnyvale, CA
Lines: 98

I think this has mostly been answered by other posters, so I've missed
most of the discussion, having been off in New Zealand [P.S., if you
ever get a chance to attend the N.Z. UNIX conference, DO IT! well-run,
great bunch of people, lot of fun, wonderful sightseeing, sheep jokes....]
However, let me add a few notes on the above comments, plus a few more
examples not already given by the other posters on this topic.

In article <28200037@ccvaxa> preece@ccvaxa.UUCP writes:
> mash@mips.UUCP:
>> If somebody says "20 addressing modes are good", to be convincing,....
>----------
>I find it a little amusing that the same people who say "complex
>feature x just isn't used very much" tend to be the same people who
>say "not to worry, a sufficiently clever compiler will take care of
>our ship's need for X". If compilers can be made smart enough to
>handle some of the special things that RISCs need, they could be
>made smart enough to make better use of the complex features in
>CISCs.

Good compiler technology is always useful; it's merely more useful on
some machines than on others. Let's assume that one can have the same
compilers on a variety of machines [both Stanford and IBM have done this].
The point is that good optimizing compilers change the statistics of
what's going on, at least somewhat, on ANY machine, and if the statistics
say that such compilers greatly lessen the use of a feature, you might
think of eliminating the feature entirely, if there was a nonzero cost
for it, and if the compilers have reasonable alternatives.

Let's go back to the example that started all this, which was me claiming
that "lots of addressing modes" needed justification as a good feature.
[This was NOT a statement that lots of modes was necessarily bad, merely
that it needed to be justified, because the published data seemed not to
justify it. BTW, does anybody have some dynamic statistics on the
multi-level indirect modes? What we've got so far is mostly static
counts, which can be misleading.]

For example, consider code like:
        if (a->b && a->b->c) ...        [not uncommon]

Suppose you have a machine that has all the indirect modes. If you have
a non-optimizing compiler, but you special-case it to pick up the indirect
modes, you can use them [and as somebody has pointed out, on some machines,
there may be an implicit reference thru a frame pointer to get to a, and
I'll assume that]. Depending on the machine, this could get you something
like

        fetch a->b              1 memory ref to offset.of.a + (fp)
                                1 memory reference to offset.of.b + above
        test
        branch around if zero
        fetch a->b->c           1 memory ref to offset.of.a + (fp)
                                1 memory reference to offset.of.b + above
                                1 memory reference to offset.of.c + above
        test ....

Suppose you have an optimizing compiler, which will surely do common
subexpression elimination and serious register allocation, or it has no
business calling itself an optimizer. What would it do?

        fetch a into r1         1 memory ref to offset.of.a + (fp)
        fetch b into r2         1 memory reference to offset.of.b + (r1)
        test r2
        branch around if zero
        fetch c                 1 memory reference to offset.of.c + (r2)
        test...

There are all sorts of variants, depending on the machine, and of course,
it's quite possible that the optimizer might have decided "a" was a good
thing to have in a register long before anyway, and amortized the cost of
getting it there over several references. The point is that the first
example has 5 address specifiers, and the second one has 3, and if the
optimizer is at all lucky, it moved the first fetch away from the second
one and got to re-use the value.

On most machines I've seen, the 2nd case will go faster than the first,
so what's happened is that some good machine-independent optimizations
have reduced the utility of the specific machine feature [multi-level
indirect addressing]. It's not that compilers can't be smart enough to
take advantage of special features [I've done some ferocious hacking on
compilers to do just that: once you have a machine, you do whatever you
can!], but that given good optimizers, some features are of less use than
others, because the optimizers change the statistics. At that point, you
can make reasoned tradeoffs, but it's hard to do without a good
understanding of what's likely to be possible for the compilers to do.

>
>The point isn't that RISCs make certain optimizations easier or harder,
>but that they make certain optimizations NECESSARY. Compilers smart
>enough to use some of the special features of CISCs haven't been
>sufficiently necessary -- they work "well enough" using simple
>instruction sequences. My impression from the literature is that RISCs
>demand more compiler optimization to reach the performance that is
>expected of them than do CISCs. Perhaps that simply means we have
>higher expectations of them, perhaps it simply means that baseline
>compiler performance is better than it used to be and those expectations
>are reasonable. Whatever.

Optimizations are by definition NEVER necessary! only desirable.
We see something like 20% improvement from the more global optimizations,
which is well worthwhile, since that's adding a few Mips to the
performance, and some important cases sometimes get more. Nevertheless,
the machines are still OK without this, and there's less weird
machine-specific hackery by far than things I've seen done on many other
machines.
-- 
-john mashey    DISCLAIMER:
UUCP:   {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086