Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!pyramid!prls!mips!hansen
From: hansen@mips.COM (Craig Hansen)
Newsgroups: comp.arch
Subject: Re: RISC is a nasty no-no!
Message-ID: <1720@mips.mips.COM>
Date: 29 Feb 88 21:56:16 GMT
References: <179@wsccs.UUCP> <696@nuchat.UUCP> <284@scdpyr.UUCP> <25699@linus.UUCP>
Lines: 74
Summary: Not all RISC machines are as slow as SPARC

In article <25699@linus.UUCP>, bs@linus.UUCP (Robert D. Silverman) writes:
> In article <284@scdpyr.UUCP: cruff@scdpyr.UUCP (Craig Ruff) writes:
> :In article <696@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
> :>From article <179@wsccs.UUCP>, by terry@wsccs.UUCP (terry):
> :>[ lots of self-congratulation about how portable his code is, followed
> :> by complaints that it isn't portable to the SPARC ]
> :>
> :>> THE REASON: Type-casting. You can't.
> :>
> :>FLAME ON! ( I love this! )
> :>
> :>WRONG. It is demonstrably NON-portable code - it failed to port
> :>to a working compiler on a reasonable machine. If the bloody
> :>unix kernel runs (and it does) your silly application should, too.

This of course pre-supposes that the SPARC architecture yields
reasonable machines. While Sun claims that the Sun4 is source-code
compatible with the Sun3, what that really means is that if it ports to
the Sun4, it was portable, and if it doesn't port, it wasn't portable.
It's ridiculous to claim the Sun4 machine is source-code compatible
unless _all_ software written for the Sun3 ports, as "portable" code
written for the Sun3 would port to many machines besides the Sun4
anyway. In fact, the SPARC architecture has a real problem with
source-code compatibility with the Sun3 machines - the alignment rules
are different between the 68020 and SPARC, and code that depends on
misaligned data is hard to port to SPARC.
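
To make the alignment issue concrete, here is a minimal sketch in C. It
is not code from the thread, and the helper name load_u32_unaligned is
hypothetical. It reads a 32-bit word from an address that may be
misaligned by copying bytes, rather than dereferencing a cast pointer,
which is the pattern that tends to work on a 68020 and trap on SPARC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustration only): fetch a 32-bit word from an
   address that is not guaranteed to be 4-byte aligned.  The value is
   copied byte by byte via memcpy, so no aligned word load is assumed.
   The non-portable alternative would be *(const uint32_t *)p, which
   relies on the hardware tolerating misaligned loads. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The result is in the host's byte order, so this addresses alignment
only, not endianness.
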
The MIPS architecture and compiler system are in a better position to
port such code because efficient instructions are available to handle
unaligned (32-bit) words, and our compiler system can be set to use
them in such code. We also provide utilities that can help to pinpoint
where in the program these problems occur, and can also fix up
references to such unaligned pointers within an exception handler, as
an aid to porting the code quickly and then going back to tune the code
later.

> There's something about RISC architectures in general that I find
> confusing. Since they (read SPARC or equivalent) have no integer multiply
> instructions, any code which has a fair number of these is going to
> be slow. This would include any program which had access to 2-D arrays
> since one must do multiplications (unless the array sizes are a convenient
> power of 2) to get the array indices right. Any code that accesses a[i][j]
> should run like a pig on such machines. I've seen some benchmarks that
> suggest SUN-4's are in fact slower than SUN-3's on programs that do a
> large amount of integer multiplies/divides. What good is a computer that
> can't multiply?

NOT all RISC architectures are created equal, particularly with respect
to integer multiply instructions. The MIPS R-Series processors have
explicit signed and unsigned integer multiply and divide instructions
that are executed in special-purpose hardware. A 32-bit multiply takes
12 cycles, with up to 10 instructions that can be executed in parallel
with the multiply. We considered that to be a far superior solution to
multiply-step, which would have been slower and harder to implement.
(Multiply-step has too many operands and too many results.)

In many cases, 2-D arrays have sizes that are known at compile-time,
and so the index calculations become multiplications by constants.
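
As an illustration of the point about compile-time array sizes, here is
a small sketch in C. The row width of 10 (COLS) and both function names
are assumptions made up for this example. The offset of a[i][j] in a
row-major array is i*COLS + j, and the multiply by 10 can be rewritten
as shifts and adds. A given compiler may or may not choose this exact
sequence.

```c
#include <stddef.h>

enum { COLS = 10 };  /* assumed row width, chosen only for illustration */

/* Offset of a[i][j] in a row-major array, using a general multiply. */
static size_t index_mul(size_t i, size_t j)
{
    return i * COLS + j;
}

/* Same offset with the multiply by 10 expressed as shifts and adds:
   10*i = 8*i + 2*i = (i << 3) + (i << 1). */
static size_t index_shift_add(size_t i, size_t j)
{
    return (i << 3) + (i << 1) + j;
}
```

Both functions compute the same offset; the second simply avoids the
multiply instruction for this particular constant.
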
Multiplication by constants can be handled efficiently by most RISC
machines, but is generally a little faster in cycle count on the HP
"Precision" architecture (I still think of it as Spectrum, but then I'm
getting old and set in my ways....), which has single-cycle
shift-and-add operations that are good for doing multiplies by
constants that aren't powers of 2. The MIPS compiler picks either an
explicit multiply operation or software shift-and-add sequences,
depending on the value and form (variable vs constant) of the operands.
The end result is that multiplies are most often faster than the
12-cycle worst-case figure.

--
Craig Hansen
Manager, Architecture Development
MIPS Computer Systems, Inc.
...{ames,decwrl,prls}!mips!hansen or hansen@mips.com
408-991-0234