Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site spar.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!mhuxn!mhuxr!ulysses!gamma!epsilon!zeta!sabre!bellcore!decvax!decwrl!spar!freeman
From: freeman@spar.UUCP (Jay Freeman)
Newsgroups: net.arch
Subject: Re: risc, cisc, and microprogramming
Message-ID: <350@spar.UUCP>
Date: Fri, 21-Jun-85 14:34:06 EDT
Article-I.D.: spar.350
Posted: Fri Jun 21 14:34:06 1985
Date-Received: Mon, 24-Jun-85 02:26:24 EDT
References: <557@hou2b.UUCP> <1078@peora.UUCP> <334@spar.UUCP> <145@mips.UUCP>
Reply-To: freeman@max.UUCP (Jay Freeman)
Organization: Schlumberger Palo Alto Research, CA
Lines: 62

[beware of the line-eat<*CHOMP*>]

In article <145@mips.UUCP> larry@mips.UUCP (Larry Weber) writes:
>> 
>> It is also "difficult" to write system software for that subclass of RISC
>> machine that simplifies hardware to the point of requiring the software to
>> allow for the resolution of pipeline data-dependencies.  It would be an
>> Interesting Task to do a code-generator for such a beast.  
>> ... 
>> [from me -- JF]
>
>This is simply not true.  [...]  The problem of code generation needn't be
>any harder than in other machine [...]

You are certainly correct that putting in NOOPs provides a quick and dirty
resolution of these dependencies.  I erred in not stating precisely that
I meant "doing it right".  I believe it is important to do it right, because
adding substantial numbers of NOOPs to code -- possibly as many as or more
than the code has "real" instructions -- would appear likely to slow
execution sufficiently that a RISC no longer has a speed advantage over a
CISC.  (I have no data -- is that correct?)
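
To make the "quick and dirty" approach concrete, here is a rough sketch of
NOOP padding.  Everything in it is an assumption made up for illustration:
the Instr type, the register names, and a toy machine on which a result
cannot be read by any of the `delay` instructions that follow the one
producing it.  It describes no particular RISC.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read


NOOP = Instr("noop", None, ())


def pad_with_noops(code, delay=1):
    """Insert NOOPs so that no instruction reads a register written by
    any of the `delay` instructions immediately preceding it."""
    out = []
    for ins in code:
        # Only the last `delay` emitted instructions can still be "in flight".
        window = out[-delay:] if delay > 0 else []
        needed = 0
        for back, prev in enumerate(reversed(window), start=1):
            if prev.dest is not None and prev.dest in ins.srcs:
                # prev is `back` slots away; it must be `delay + 1` away.
                needed = max(needed, delay - back + 1)
        out.extend([NOOP] * needed)
        out.append(ins)
    return out


def noop_overhead(code, delay=1):
    """NOOPs added per real instruction."""
    if not code:
        return 0.0
    return (len(pad_with_noops(code, delay)) - len(code)) / len(code)


# Hypothetical sequence: each load is consumed by the very next instruction.
SAMPLE = [
    Instr("load", "r1", ("r9",)),
    Instr("add", "r2", ("r1", "r1")),
    Instr("load", "r3", ("r9",)),
    Instr("add", "r4", ("r3", "r2")),
]
```

On this (deliberately unfavorable) sample, a delay of 2 adds one NOOP for
every real instruction.  That says nothing about typical code; it only
shows how the overhead scales with the delay on the toy model.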

I will also accept that a peephole optimizer for removing pipeline
dependencies need not be longer than a few hundred lines.  However, at least
some global data-flow analysis appears desirable, to ensure that cycles
are not wasted in unnecessary resolution after jumps and calls:

(It is well known that many programs spend most of their time in
 relatively short loops -- adding NOOPs needlessly at the start of
 each loop could hurt.  (Again, no data -- key issues would appear to be
 how short the "typical" loop is, how many NOOPs the particular RISC
 requires to flush its pipelines, and what actual advantage the RISC with
 fully-optimized code has over its competitors, anyway.))

(A similar issue may exist with respect to call overhead.)
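
A sketch of the loop-entry concern, under the same sort of made-up model
(a result is unreadable for `delay` instructions after its producer; the
Instr type, block lists and register names are all hypothetical).  Without
knowledge of what precedes a block, the padder must assume every register
is still in flight at the block's head.  With even a little information
about the predecessor blocks, it can often insert nothing.  The sketch
ignores branch delay slots and assumes each predecessor is at least
`delay` instructions long, so it is not a full data-flow analysis.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read


def conservative_entry_noops(block, delay):
    """NOOPs at block entry when nothing is known about predecessors:
    every register is assumed to have been written by the instruction
    just before the block."""
    needed = 0
    for pos, ins in enumerate(block[:delay]):
        if ins.srcs:
            needed = max(needed, delay - pos)
    return needed


def pending_at_exit(block, delay):
    """Map register -> slots still to wait after the block's last
    instruction (only the last `delay` instructions matter)."""
    pending = {}
    tail = block[-delay:] if delay > 0 else []
    for back, ins in enumerate(reversed(tail), start=1):
        if ins.dest is not None:
            remaining = delay - back + 1
            pending[ins.dest] = max(pending.get(ins.dest, 0), remaining)
    return pending


def entry_noops(block, preds, delay):
    """NOOPs at block entry given the list of predecessor blocks."""
    pending = {}
    for p in preds:
        for reg, rem in pending_at_exit(p, delay).items():
            pending[reg] = max(pending.get(reg, 0), rem)
    needed = 0
    for pos, ins in enumerate(block[:delay]):
        for reg in ins.srcs:
            # An instruction at offset `pos` has already waited `pos` slots.
            needed = max(needed, pending.get(reg, 0) - pos)
    return needed


# Hypothetical loop: a preheader falls into the body, which branches back.
PREHEADER = [Instr("li", "r5", ()), Instr("li", "r2", ())]
BODY = [
    Instr("load", "r1", ("r5",)),
    Instr("add", "r5", ("r5", "r6")),
    Instr("add", "r2", ("r2", "r1")),
    Instr("bne", None, ("r5", "r8")),
]
```

Here the conservative rule puts one NOOP at the head of the loop on every
iteration (with delay 1), while entry_noops(BODY, [PREHEADER, BODY], 1)
finds that none is needed, since neither predecessor ends by writing r5.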

Notwithstanding, I think I still see a problem with respect to getting good
system software for these beasties.  I reason as follows:  Much of system
software requires well-optimized code.  Therefore we either write it in a
high-level language for which we have a compiler that generates good code,
or else we go in and hand-code the major bottlenecks in assembler.

I submit that optimizing pipeline dependencies by hand, in assembler, is
likely to provide a rich source of subtle, persistent and fascinating bugs.
Great fun [Insert :-) here, I think], but not necessarily conducive to lots
of good code.
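
One conceivable aid for the hand-coder is a checker that flags reads
falling inside a producer's delay window.  The following is only a sketch
on an invented model (the Instr type, the `delay` parameter and the
register names are assumptions, not any real assembler's format), and it
is conservative: it may flag a read whose register was rewritten by an
intervening instruction.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read


def find_hazards(code, delay=1):
    """Return (producer_index, consumer_index, register) for every read
    that falls within `delay` instructions of a write to that register."""
    hazards = []
    for i, ins in enumerate(code):
        for j in range(max(0, i - delay), i):
            prev = code[j]
            if prev.dest is not None and prev.dest in ins.srcs:
                hazards.append((j, i, prev.dest))
    return hazards


# Hypothetical hand-scheduled fragment: the add was moved up one slot
# too far and now reads r1 before the load has delivered it.
HAND_CODED = [
    Instr("load", "r1", ("r9",)),
    Instr("add", "r2", ("r1", "r3")),
    Instr("store", None, ("r2", "r9")),
]
```

With delay 1, find_hazards(HAND_CODED) reports both the load/add pair on
r1 and the add/store pair on r2.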

I also suggest that decent optimizing compilers are not really
across-the-board state-of-the-art for the microcomputer industry yet.  I
have a sense that although they exist, they are often slow to come out, slow
to be updated and made bug-free, and relatively scarce.  If the Acme RISC
Machine Co. introduces a new wonder chip, how long will it be before such
tools are available in sufficient variety to suit all the eager developers?

To the extent that these arguments are convincing, I am induced to speculate
that the success or failure of computer systems based on RISC-machine
hardware may be embarrassingly tightly linked to the state of the
system-software industry.

I am not sure that my arguments are correct, and I am not confident of my
perception of the software industry, either.  What do other people think?
-- 
Jay Reynolds Freeman (Schlumberger Palo Alto Research)(canonical disclaimer)