Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!pyramid!prls!mips!hansen
From: hansen@mips.COM (Craig Hansen)
Newsgroups: comp.arch
Subject: Re: RISC is a nasty no-no!
Message-ID: <1720@mips.mips.COM>
Date: 29 Feb 88 21:56:16 GMT
References: <179@wsccs.UUCP> <696@nuchat.UUCP> <284@scdpyr.UUCP> <25699@linus.UUCP>
Lines: 74
Summary: Not all RISC machines are as slow as SPARC

In article <25699@linus.UUCP>, bs@linus.UUCP (Robert D. Silverman) writes:
> In article <284@scdpyr.UUCP> cruff@scdpyr.UUCP (Craig Ruff) writes:
> :In article <696@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
> :>From article <179@wsccs.UUCP>, by terry@wsccs.UUCP (terry):
> :>[ lots of self-congratulation about how portable his code is, followed
> :>  by complaints that it isn't portable to the SPARC ]
> :>
> :>> THE REASON:  Type-casting.  You can't.
> :>
> :>FLAME ON!    ( I love this! )
> :>
> :>WRONG.  It is demonstrably NON-portable code - it failed to port
> :>to a working compiler on a reasonable machine.  If the bloody
> :>unix kernel runs (and it does) your silly application should, too.

This of course pre-supposes that the SPARC architecture yields
reasonable machines. While Sun claims that the Sun4 is source-code
compatible with the Sun3, what that really means is that if it ports
to the Sun4, it was portable, and if it doesn't port, it wasn't
portable. It's ridiculous to claim the Sun4 machine is source-code
compatible unless _all_ software written for the Sun3 ports to it,
since "portable" code written for the Sun3 would port to many machines
besides the Sun4 anyway.

In fact, the SPARC architecture has a real problem with source-code
compatibility with the Sun3 machines - the alignment rules are
different between the 68020 and SPARC, and code that depends on
misaligned data is hard to port to SPARC. The MIPS architecture and
compiler system is in a better position to port such code because
efficient instructions are available to handle unaligned (32-bit)
words, and our compiler system can be set to use them in such code. We
also provide utilities that can help to pinpoint where in the program
these problems occur, and can also fix up references to such unaligned
pointers within an exception handler, as an aid to porting the code
quickly and then going back to tune the code later.
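
To make the alignment issue concrete, here is a minimal C sketch of the
portable way to read or write a 32-bit word at an address that may not be
4-byte aligned. The helper names load_u32_unaligned and store_u32_unaligned
are hypothetical, chosen for illustration; they are not part of any MIPS or
Sun toolchain. Code that instead casts a misaligned char pointer to an int
pointer and dereferences it is the kind that tends to work on a 68020 and
trap on a machine with strict alignment rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch a 32-bit word from an address that may not be
   4-byte aligned. Copying bytes with memcpy leaves it to the compiler to
   pick an access sequence that is legal on the target, instead of issuing
   a plain word load that could trap. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching store. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

A caller would write store_u32_unaligned(buf + 1, value) rather than
*(uint32_t *)(buf + 1) = value. How cheaply this compiles depends on the
target and the compiler.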

> There's something about RISC architectures in general that I find 
> confusing. Since they (read SPARC or equivalent) have no integer multiply
> instructions, any code which has a fair number of these is going to
> be slow. This would include any program which had access to 2-D arrays
> since one must do multiplications (unless the array sizes are a convenient
> power of 2) to get the array indices right. Any code that accesses a[i][j]
> should run like a pig on such machines. I've seen some benchmarks that
> suggest SUN-4's are in fact slower than SUN-3's on programs that do a 
> large amount of integer multiplies/divides. What good is a computer that
> can't multiply?

All RISC architectures are NOT created equal, particularly with respect
to integer multiply instructions. The MIPS R-Series processors have
explicit signed and unsigned integer multiply and divide instructions
that are executed in special-purpose hardware. A 32-bit multiply takes
12 cycles, and up to 10 other instructions can be executed in parallel
with the multiply. We considered that to be a far better solution
than multiply-step, which would have been slower and harder to
implement. (Multiply-step has too many operands and too many results.)
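
For readers unfamiliar with the term, the following C sketch shows roughly
what a multiply-step sequence amounts to: each step examines one bit of the
multiplier, conditionally adds, and shifts, so a 32-bit multiply needs on
the order of 32 steps. This is a simplified model for illustration only, not
the definition of any particular machine's instruction; the name
mul_by_steps is hypothetical, and the function returns only the low 32 bits
of the product.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of a one-bit-per-step multiply.
   Returns the low 32 bits of a * b. */
static uint32_t mul_by_steps(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    int step;

    for (step = 0; step < 32; step++) {
        if (b & 1u)          /* low multiplier bit set: add the multiplicand */
            product += a;
        a <<= 1;             /* multiplicand moves up one bit position */
        b >>= 1;             /* expose the next multiplier bit */
    }
    return product;
}

/* Small usage example: compare against the built-in multiply. */
static void mul_by_steps_selfcheck(void)
{
    assert(mul_by_steps(6u, 7u) == 6u * 7u);
}
```

Each pass through the loop body stands in for one step, and the steps depend
on each other in sequence.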

In many cases, 2-D arrays have sizes that are known at compile time,
so the index computations become multiplications by constants.
Multiplications by constants can be handled efficiently by most RISC
machines, but are generally a little faster in cycle count on the HP "Precision"
architecture (I still think of it as Spectrum, but then I'm getting
old and set in my ways....), which has single-cycle shift-and-add
operations that are good for doing multiplies by constants that aren't
powers of 2.
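
As an illustration of the point about constant array sizes, here is a small
C sketch of the index computation for a[i][j] when the row length is a
compile-time constant that is not a power of 2. The row length of 10 and
the names NCOLS, times_ncols and index2d are assumptions made up for the
example; a real compiler chooses its own sequence.

```c
#include <assert.h>
#include <stddef.h>

enum { NCOLS = 10 };  /* hypothetical row length, deliberately not a power of 2 */

/* i * 10 rewritten as i*8 + i*2: two shifts and one add, no multiply. */
static size_t times_ncols(size_t i)
{
    return (i << 3) + (i << 1);
}

/* Element offset of a[i][j] from the start of an array declared
   as a[...][NCOLS]. */
static size_t index2d(size_t i, size_t j)
{
    assert(j < NCOLS);
    return times_ncols(i) + j;
}
```

For an int a[5][NCOLS], the element a[3][4] sits index2d(3, 4) elements
past the start of the array, and no multiply instruction is needed to find
it.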

The MIPS compiler picks either an explicit multiply operation or
software shift-and-add sequences, depending on the value and form
(variable vs. constant) of the operands. The end result is that
multiplies are most often faster than the 12-cycle worst-case figure.

-- 
Craig Hansen
Manager, Architecture Development
MIPS Computer Systems, Inc.
...{ames,decwrl,prls}!mips!hansen or hansen@mips.com   408-991-0234