Path: utzoo!attcan!uunet!decwrl!sdd.hp.com!hplabs!hpcc05!hp-ptp!marcb
From: marcb@hp-ptp.HP.COM (Marc Brandis)
Newsgroups: comp.arch
Subject: Re: Re: a style question
Message-ID: <1300002@hp-ptp.HP.COM>
Date: 3 Oct 90 01:06:28 GMT
References: <1990Oct2.151644.1581@phri.nyu.edu>
Organization: HP Pacific Technology Park - Sunnyvale, Ca.
Lines: 49

meissner@osf.org (Michael Meissner) writes:

>In article <1990Oct2.151644.1581@phri.nyu.edu> roy@phri.nyu.edu (Roy
>Smith) writes:

>| 	Are there actually any machines in which a compare-and-branch for
>| inequality is any faster or slower than a compare-and-branch for less-than?
>| It seems to me that either should take one pass through the ALU to do the
>| comparison and set some flags, so they should both take the same amount of
>| time.  I'm basing my assumption on experience with pdp-11 type machines,
>| but I find it hard to imagine any other machine being significantly
>| different.  Maybe if you had an asynchronous ALU?

>Yes, any computer based on the MIPS chipset (MIPS, DECstation, SGI) is
>faster to do a branch on both equality and inequality than for the
>other comparison operators.

>Mips does not have a branch on a < b (unless b is 0).  It has a set
>register to 1 if a < b instruction (& 0 otherwise).  Thus to do the
>branch, you set the scratch register to be the value of a < b, and
>then do a branch on that register being zero.  It does have a direct
>instruction to branch if two registers are equal or not equal.

I do not know the actual reasons why the MIPS was designed like
this.

One possible reason is that a compare for equality or inequality can
be implemented faster than an arithmetic compare. An arithmetic
comparison basically needs an adder: it is a subtraction, with carry
propagation across the whole word. A compare for inequality needs
only a row of XOR gates, one per bit, with their outputs ORed
together. Because this is so fast, the comparison can be done in an
early pipeline stage (e.g. just after the register fetch), and the
data does not have to be fed through the ALU. This can save one
cycle, as the result of the comparison, and therefore the outcome of
the branch, is known earlier.
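
To make the difference concrete, here is a rough sketch in C of the two
kinds of comparison. The function names is_equal and is_less_signed are
my own, chosen for illustration, and the 32-bit word size is an
assumption. This only models the logic, not how any particular chip
wires it up.

```c
#include <stdint.h>

/* Equality: XOR the operands bit by bit, then check that no bit is set.
 * Each bit is independent of the others, so there is no carry chain. */
static int is_equal(uint32_t a, uint32_t b)
{
    return (a ^ b) == 0;
}

/* Signed less-than: computed from the subtraction a - b, which needs
 * carry propagation across all 32 bits (i.e. an adder).
 * The result is (sign of difference) XOR (signed overflow). */
static int is_less_signed(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;                     /* full-width subtract */
    uint32_t overflow = (a ^ b) & (a ^ diff);  /* bit 31: signed overflow */
    return (int)((diff ^ overflow) >> 31);     /* N xor V */
}
```

The first function depends only on per-bit results, while the second
cannot produce its answer until the subtraction has finished, which is
the point made above.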

I remember having read about one machine being designed like this, but
I do not remember which one.


(* I speak only for myself.
	Marc-Michael Brandis
	Institut fuer Computersysteme
	ETH Zentrum
	CH-8092 Zuerich, Switzerland
	e-mail: brandis@inf.ethz.ch
		brandis@iis.ethz.ch
   Temporarily at HP, marcb@hp-ptp.ptp.hp.com
*)