Path: utzoo!attcan!uunet!snorkelwacker!paperboy!meissner
From: meissner@osf.org (Michael Meissner)
Newsgroups: comp.sys.m88k
Subject: Re: Emulating other computers on 88K's and Benchmarks
Message-ID: <MEISSNER.90Oct13222526@osf.osf.org>
Date: 14 Oct 90 02:25:26 GMT
References: <1990Oct3.095041.9295@canterbury.ac.nz> <newton.655360878@smoggy>
	<41965@mips.mips.COM> <1990Oct11.174838.7990@unx.sas.com>
Sender: news@OSF.ORG
Organization: Open Software Foundation
Lines: 80
In-reply-to: sasrer@unx.sas.com's message of 11 Oct 90 17:48:38 GMT

In article <1990Oct11.174838.7990@unx.sas.com> sasrer@unx.sas.com
(Rodney Radford) writes:

| But there are cases when having the same register set for both the general
| purpose registers and the floating point registers can offer improvements
| in the code by allowing you to operate on the floating point values with the
| same integer instructions (for example: using some of the specialized bit 
| manipulation instructions).  Also, the chips listed above that have the 
| 'extra' FP registers you mention are actually on external floating point 
| coprocessor chips, so they should not be included in the register count when
| comparing specific RISC processors. The choice of whether to use an external
| floating point coprocessor is a system designers choice, not a specific
| RISC chip requirement (it is possible for an 88K to also be hooked to an 
| external math coprocessor, although I have not heard of such an arrangement). 

I worked on GCC for the 88k for 1 year, and for the MIPS chips for 1
year, so I have a little experience on both sides.  :-)

The statement about operating on floating point values with integer
instructions is a bit of a red herring.  If you have to support a
signaling NaN, the code sequence to check for the NaN wipes out any
savings by using the faster integer instructions.
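
A minimal sketch of that trade-off in C, not taken from any real
compiler or library.  It assumes a 32-bit IEEE single-precision float
and the NaN convention where a set top fraction bit means "quiet" (some
architectures use the opposite convention).  The function names are
made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Copy a float's bit pattern into an integer and back (assumes 32-bit float). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* The cheap integer trick: absolute value is a single AND on the sign bit. */
float fabs_by_mask(float f)
{
    return bits_float(float_bits(f) & 0x7FFFFFFFu);
}

/* The extra work needed if signaling NaNs must be noticed first:
   exponent all ones, fraction nonzero, quiet bit clear (assumed convention). */
int is_signaling_nan(uint32_t bits)
{
    uint32_t exponent = bits & 0x7F800000u;
    uint32_t fraction = bits & 0x007FFFFFu;
    return exponent == 0x7F800000u
        && fraction != 0
        && (fraction & 0x00400000u) == 0;
}
```

The mask is one instruction; the check is several masks, compares, and
branches, which is where the savings go.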

Whether the FPU is implemented as a separate chip is immaterial.  I'm
not aware of ANY vendor using MIPS chips that does not include an FPU.
The question is whether the instruction set helps or hinders the task
at hand.

Separating the register sets is helpful, because you've just doubled
the number of registers without changing instruction formats.  When I
was at Data General, we did some hand checking, and found that
unrolling loops could not keep the machine going at full tilt, because
you run out of registers too quickly.  I wouldn't bet that the 88k
will have a unified register set forever....

The only time that I've wished the MIPS had a unified register set was
in dealing with varargs functions, where you would like to be able to
store all unknown arguments on the stack and walk a pointer.
However, the 88k doesn't win any points in this arena either, because
the Green Hills inspired 88K OCS demands two separate arrays:
va_list is a 3 word structure, and the va_arg macro has to check on
every call whether or not the argument is in the first 8 words....
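
A rough sketch of the difference, in portable C.  This is NOT the
actual OCS definition: the structure layout, the field names, and both
function names are invented for illustration, and only int-sized
arguments are handled.

```c
#define REG_ARG_WORDS 8   /* from the post: the first 8 words go in registers */

/* Hypothetical three-word va_list; field names are assumptions. */
typedef struct {
    int next_word;          /* index of the next argument word */
    const int *reg_area;    /* saved copy of the register argument words */
    const int *stack_area;  /* argument words that overflowed to the stack */
} sketch_va_list;

/* Two-array scheme: every fetch must test which array holds the word. */
int sketch_va_arg_int(sketch_va_list *ap)
{
    int i = ap->next_word++;
    if (i < REG_ARG_WORDS)
        return ap->reg_area[i];
    return ap->stack_area[i - REG_ARG_WORDS];
}

/* Single-array scheme: all arguments are contiguous, so just walk a pointer. */
int walk_va_arg_int(const int **p)
{
    return *(*p)++;
}
```

The second form is a load and an increment; the first carries a compare
and branch on every argument.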

Like most people, I find the current generation of RISC chips to be
fairly similar.  However, as a compiler writer there are things about
each of the two processors that I like and dislike:

Things that I like about the 88k that aren't in MIPS:

    *	reg+reg, and reg+(reg*base_size) addressing modes.
    *	bit extraction operators (except no bit field set).
    *	pure PC-relative jumps/subroutine calls.
    *	branch insns w/optional delay slot (saves code space, not time).
    *	and.u, or.u, xor.u instructions.
    *	standard calling sequence has 13 saved regs instead of 9.
    *	standard calling sequence passes 8 words in regs instead of 4.
    *	better conversion ops (esp. int<->floating point).
    *	hardware interlocks.
    *	pipelined multiply rather than a separate multiply unit.
    *	add/subtract with carry.
    *	assembler supports creating debug information.
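
To make two of the items above concrete, here is a C model of what the
bit extraction operators and the .u (upper halfword) logical
instructions save the compiler from spelling out.  It is only a model
of the behavior; the function names are hypothetical, and the field
functions assume 1 <= width, offset < 32, and offset + width <= 32.

```c
#include <stdint.h>

/* Unsigned bit field extract: 'width' bits starting at bit 'offset'. */
uint32_t extract_unsigned(uint32_t v, unsigned offset, unsigned width)
{
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (v >> offset) & mask;
}

/* Signed bit field extract: same field, sign-extended from its top bit.
   The final conversion relies on two's complement wraparound. */
int32_t extract_signed(uint32_t v, unsigned offset, unsigned width)
{
    uint32_t field = extract_unsigned(v, offset, width);
    uint32_t sign = 1u << (width - 1);
    return (int32_t)((field ^ sign) - sign);
}

/* Model of an or.u style operation: OR a 16-bit immediate into the
   upper half of a register value. */
uint32_t or_upper(uint32_t reg, uint16_t imm)
{
    return reg | ((uint32_t)imm << 16);
}

/* A full 32-bit constant then takes two steps: upper half, then lower half. */
uint32_t load_constant(uint16_t hi, uint16_t lo)
{
    return or_upper(0, hi) | lo;
}
```

Without a field extract, each of the first two functions is a shift
plus a mask (or two shifts) in the generated code.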

Things I like about the MIPS that aren't in the 88k:

    *	more registers, since FPU regs are separate.
    *	signed division doesn't require branches to fix up sign/avoid traps.
    *	modulus operation without having to do a - ((a/b)*b).
    *	a divided by b gives a/b and a%b at same time.
    *	32x32->64 bit multiply.
    *	small data/bss area (I think this is just coming to the 88k).
    *	standard calling sequence passes structs in regs, not in the stack.
    *	branch on a==b and a!=b are each one instruction.
    *	assembler temporary register.
    *	only 1 cycle delay after load rather than 2.
    *	ECOFF debug format is slightly more expressive than COFF debug format.
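
The division and multiply items above, written out in C as a sketch.
The function names are made up; div() is the standard C library
routine.  Nothing here is specific to either chip, it only shows what
the compiler has to synthesize when the hardware lacks the operation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Remainder when only a quotient is available: a - ((a/b)*b).
   Costs a divide, a multiply, and a subtract.  b must be nonzero. */
int32_t mod_via_div(int32_t a, int32_t b)
{
    return a - (a / b) * b;
}

/* Quotient and remainder from one operation, as a divide that yields
   both would give them.  div() returns both in a div_t. */
void quot_and_rem(int a, int b, int *quot, int *rem)
{
    div_t r = div(a, b);
    *quot = r.quot;
    *rem = r.rem;
}

/* 32x32->64 bit multiply: widen one operand first so the full
   product is kept instead of being truncated to 32 bits. */
int64_t widening_mul(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```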
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?