Path: utzoo!attcan!uunet!samsung!uakari.primate.wisc.edu!sdd.hp.com!decwrl!sgi!vjs@rhyolite.wpd.sgi.com
From: vjs@rhyolite.wpd.sgi.com (Vernon Schryver)
Newsgroups: comp.arch
Subject: Re: int x int -> long for * (or is it 32x32->64)
Keywords: arithmetic,arbitrary precision,benchmark,modular arithmetic
Message-ID: <69434@sgi.sgi.com>
Date: 15 Sep 90 02:38:11 GMT
References: <3984@bingvaxu.cc.binghamton.edu> <41425@mips.mips.COM> <41497@mips.mips.COM>
Sender: guest@sgi.sgi.com
Organization: Silicon Graphics, Inc., Mountain View, CA
Lines: 26

In article <41497@mips.mips.COM>, mash@mips.COM (John Mashey) writes:
> ....
> -Apparently simple-looking additions can have surprisingly
> ugly and widespread effects

Not to disagree, but I have many times bemoaned the lack of some kind of
fast carry indication from the MIPS add instruction while looking for ways
to make the TCP/IP checksum faster. If a single instruction could compute
32-bit + 32-bit -> 32-bit with carry-bit, you could substantially reduce the
number of instructions in all of the algorithms I've found.

The savings from such a facility, counted in seconds, would be reduced by
apparently unavoidable cache delays. That does not stop the moaning. One
can construct tighter checksum loops on the 29K, 68K, and the 80386.

Given operating system tricks on the byte copies, computing the checksum is
the larger of the two major places where the code I know spends most of its
time.

My limited and decades-past dabbling with general purpose long arithmetic
makes me ask: would there be a similar benefit to extended precision
addition?

Vernon Schryver
vjs@sgi.com
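
To make the complaint concrete, here is a minimal C sketch of the two cases
mentioned above: a ones-complement (TCP/IP style) checksum over 32-bit words,
and a multiword add. It is not code from the post. The names sum32_with_carry,
fold16, and add_multiword are hypothetical, and the sketch assumes the input is
already aligned to whole 32-bit words. With no carry flag available, each carry
is recovered with an unsigned compare after the add, which is presumably the
kind of extra per-word work the post is moaning about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones-complement sum of n 32-bit words (hypothetical helper).
 * No carry flag is assumed: after each add, a result smaller than the
 * addend means the add wrapped, so the carry is added back in. */
static uint32_t sum32_with_carry(const uint32_t *words, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += words[i];
        sum += (sum < words[i]);    /* end-around carry: 1 if the add wrapped */
    }
    return sum;
}

/* Fold a 32-bit ones-complement sum down to 16 bits. The second fold
 * absorbs a carry that the first fold may produce. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Extended precision add: dst = a + b over n words, least significant
 * word first. Returns the carry out of the top word. */
static uint32_t add_multiword(uint32_t *dst, const uint32_t *a,
                              const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t  = a[i] + carry;
        uint32_t c1 = (t < carry);  /* wrapped while adding the carry in */
        uint32_t s  = t + b[i];
        uint32_t c2 = (s < b[i]);   /* wrapped while adding b[i] */
        assert(!(c1 & c2));         /* both wraps cannot happen in one step */
        dst[i] = s;
        carry = c1 | c2;
    }
    return carry;
}
```

In both loops the compare and the second add exist only to recover a bit that
an add-with-carry instruction would deliver for free, so the multiword add
looks like it would benefit in much the same way as the checksum.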