Path: utzoo!attcan!uunet!cs.utexas.edu!uwm.edu!rpi!julius.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@crhc.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: Re: importance of carry logic (was 32 x 32 -> ?? multiply)
Message-ID: <AGLEW.90Sep17122306@dwarfs.crhc.uiuc.edu>
Date: 17 Sep 90 17:23:06 GMT
References: <3984@bingvaxu.cc.binghamton.edu> <41425@mips.mips.COM>
	<AGLEW.90Sep16115215@dwarfs.crhc.uiuc.edu>
	<1990Sep17.062626.12006@quick.com>
Sender: news@ux1.cso.uiuc.edu (News)
Organization: Center for Reliable and High-Performance Computing University of
	Illinois at Urbana Champaign
Lines: 21
In-Reply-To: srg@quick.com's message of 17 Sep 90 06:26:26 GMT

>>     I.e., based on my experience coding in_chksum(), but not having
>> coded it on a MIPS, I would estimate that the slowdown through not
>> having carry out and in is approximately 3-fold wrt. good code that
>> uses carry-out and in. But this is only an upper bound, because
>> overhead of call, etc., gets in the way.
>
>Actually, the upper bound on the MIPS would be a 2-fold increase.  All
>you have to do is compute the checksum 2 bytes at a time and let all
>the carries accumulate in the upper half of a register.  When you
>get to the end you fold them around to the lower half (twice!) and
>you're done.  I can think of several strategies for improving this
>figure by some small increment (e.g., loop unrolling is likely to
>be slightly more profitable given that you'll be processing twice
>as many data items).

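Yeah, I thought of this just after posting.  Most protocols set an
upper limit on buffer size, so the carries collecting in the upper
16 bits of a 32-bit register can never run out of room (it takes
more than 65537 16-bit words, roughly 128K bytes, to overflow), and
you don't need to check for overflow from the upper 16 bits.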
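
A rough sketch of the deferred-carry approach described above, in
portable C rather than MIPS assembler.  The function name
cksum16_deferred and the assert on the buffer bound are my own
assumptions for illustration, not anything from in_chksum() itself;
it also assumes the data is already aligned as 16-bit words and
leaves out the odd trailing byte and the final complement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: one's-complement sum of 16-bit words without
 * using a carry flag.  Carries pile up in the upper half of a 32-bit
 * accumulator and are folded back in only once, at the end.
 */
uint16_t cksum16_deferred(const uint16_t *words, size_t nwords)
{
    uint32_t sum = 0;

    /* 65537 * 0xFFFF == 0xFFFFFFFF, so this many words cannot
     * overflow the accumulator; a protocol's buffer limit is
     * assumed to guarantee the bound. */
    assert(nwords <= 65537u);

    for (size_t i = 0; i < nwords; i++)
        sum += words[i];

    /* Fold the accumulated carries into the low half.  The first
     * fold can itself carry out of bit 15, hence the second. */
    sum = (sum & 0xFFFFu) + (sum >> 16);
    sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)sum;
}
```

The second fold matters for sums like 0x1FFFF: the first fold gives
0xFFFF + 1 = 0x10000, and only the second brings that back to 0x0001.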
    
--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]