Path: utzoo!attcan!uunet!samsung!uakari.primate.wisc.edu!sdd.hp.com!decwrl!sgi!vjs@rhyolite.wpd.sgi.com
From: vjs@rhyolite.wpd.sgi.com (Vernon Schryver)
Newsgroups: comp.arch
Subject: Re: int x int -> long for * (or is it 32x32->64)
Keywords: arithmetic,arbitrary precision,benchmark,modular arithmetic
Message-ID: <69434@sgi.sgi.com>
Date: 15 Sep 90 02:38:11 GMT
References: <3984@bingvaxu.cc.binghamton.edu> <41425@mips.mips.COM> <41497@mips.mips.COM>
Sender: guest@sgi.sgi.com
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 26

In article <41497@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
> ....
> 	-Apparently simple-looking additions can have surprisingly
> 	ugly and widespread effects


Not to disagree, I have many times bemoaned the lack of some kind of fast
carry indication from the MIPS add instruction while looking for ways to
make the TCP/IP checksum faster.  If a single instruction could compute
	    32-bit + 32-bit -> 32-bit with carry-bit,
you could substantially reduce the number of instructions in all of the
algorithms I've found.  The savings from having such a facility, counted in
seconds, would be reduced by apparently unavoidable cache delays.  That
does not stop the moaning.
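
To make the cost concrete, here is a minimal sketch of a checksum loop on a
machine with no carry indication.  It is an illustration only, not code from
any real TCP/IP implementation; the name cksum32 and the assumption that the
data is a whole number of aligned 32-bit words are hypothetical.  The carry
out of each add has to be recovered with an unsigned comparison and then
added back in as the end-around carry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement checksum over aligned 32-bit words.
 * The 16-bit result is in the same byte order as the words were loaded. */
uint16_t cksum32(const uint32_t *words, size_t nwords)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        uint32_t w = words[i];

        sum += w;           /* 32-bit add, carry out is lost        */
        sum += (sum < w);   /* wrapped iff sum < w; add carry back  */
    }

    /* Fold the 32-bit one's-complement sum down to 16 bits. */
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);

    return (uint16_t)~sum;
}
```

Per word, the loop body is a load, an add, a compare, and another add.  With
an add that delivered a carry bit, the compare and the second add would
presumably collapse into a single add-with-carry.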

One can construct tighter checksum loops on the 29K, 68K, and the 80386.

Given operating system tricks on the byte copies, computing the checksum is
the larger of the two places where the code I know spends most of its time.


My limited and decades-past dabbling with general purpose long arithmetic
makes me ask: would there be a similar benefit for extended precision addition?
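
For reference, a minimal sketch of what multi-word addition looks like when
the carry has to be rebuilt by hand.  The name mp_add and the layout (32-bit
limbs, least significant first) are assumptions made for illustration, not
taken from any particular long arithmetic package.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: r = a + b over n 32-bit limbs, least significant
 * limb first.  Returns the carry out of the top limb (0 or 1). */
uint32_t mp_add(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t t  = a[i] + carry;   /* add the carry in            */
        uint32_t c1 = (t < carry);    /* wrapped iff result < carry  */
        uint32_t s  = t + b[i];       /* add the second operand      */
        uint32_t c2 = (s < t);        /* wrapped iff result < t      */

        r[i] = s;
        carry = c1 | c2;              /* at most one of them is set  */
    }
    return carry;
}
```

Each limb costs two adds, two compares, and an OR besides the loads and the
store, where a machine with a carry bit could probably use one add-with-carry.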


Vernon Schryver     vjs@sgi.com