Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!necntc!ames!sdcsvax!ucbvax!imagen.UUCP!geof
From: geof@imagen.UUCP (Geof Cooper)
Newsgroups: comp.protocols.tcp-ip
Subject: TCP checksum unrolling
Message-ID: <8710062308.AA00107@apolling.imagen.uucp>
Date: Tue, 6-Oct-87 19:08:57 EDT
Article-I.D.: apolling.8710062308.AA00107
Posted: Tue Oct 6 19:08:57 1987
Date-Received: Fri, 9-Oct-87 23:51:58 EDT
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: imagen!geof@decwrl.dec.com
Distribution: world
Organization: The ARPA Internet
Lines: 62

Here is one, undebugged, that illustrates the concept. It also uses
the trick that (if you have it) you can use 32-bit two's complement
addition and add all the carries in at the end (another trick that is
sometimes faster is to generate a 32-bit one's complement sum and then
add the top and bottom halves together to get the 16-bit sum).

Some C compilers won't accept the weird syntax below; or maybe I should
point out, as you retch on the floor, that there is at least ONE C
compiler that DOES accept this syntax. It is trivial to code it for
all C compilers -- but what you really want to do is code the exact
intent of the following into assembly language. That makes it a lot
faster to add the two halves of a 32-bit word.

These tricks don't work for XNS checksums. Our experience is that this
difference alone makes our XNS implementation a little slower than our
TCP implementation on a 68000.

- Geof

    /* Sum n 16-bit words and return the folded 16-bit one's complement sum. */
    unsigned short
    checksum(p, n)
    unsigned short *p;
    short n;
    {
        short nloop;            /* loop passes: one partial pass plus n/8 full ones */
        short nrem;             /* leftover words handled by the partial pass */
        unsigned long sum;      /* 32-bit accumulator; carries pile up in the top half */

        sum = 0;
        if ( n > 0 ) {
            nloop = (n >> 3) + 1;
            nrem = n & 7;

            /* Jump into the middle of the unrolled loop to add the nrem
               leftover words first; each later pass adds 8 words. */
            switch ( nrem ) {
                do {
                        sum += *p++;
                case 7: sum += *p++;
                case 6: sum += *p++;
                case 5: sum += *p++;
                case 4: sum += *p++;
                case 3: sum += *p++;
                case 2: sum += *p++;
                case 1: sum += *p++;
                case 0: ;
                } while ( --nloop > 0 );
            }
        }
        /* Add the carries back in; the second fold absorbs any carry
           produced by the first. */
        sum = (sum >> 16) + (sum & 0xffff);
        sum = (sum >> 16) + (sum & 0xffff);
        return ( (unsigned short) sum );
    }
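
The post says the same idea is trivial to code for any C compiler. What follows is only a rough sketch of how that might look in standard C, not the author's code: the names fold16, cksum_unrolled and cksum_wide are made up for illustration, n is assumed to count 16-bit words and to be at most 65535, and the result is the folded sum (not complemented). cksum_wide illustrates the second trick mentioned above, a 32-bit one's complement sum whose halves are added at the end; it builds each 32-bit value from two 16-bit words as a portable stand-in for a real 32-bit load.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator to 16 bits by adding the upper half (the
 * accumulated carries) into the lower half.  The first pass leaves at
 * most 0x1fffe, so a second pass is enough. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);
    return (uint16_t)sum;
}

/* Unrolled sum of n 16-bit words with the carries deferred to the end.
 * Assumption: n <= 65535, so the 32-bit accumulator cannot overflow. */
uint16_t cksum_unrolled(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;
    size_t blocks = n >> 3;   /* number of full 8-word passes */
    size_t rem = n & 7;       /* leftover words, added first */

    while (rem-- > 0)
        sum += *p++;

    while (blocks-- > 0) {
        sum += p[0];
        sum += p[1];
        sum += p[2];
        sum += p[3];
        sum += p[4];
        sum += p[5];
        sum += p[6];
        sum += p[7];
        p += 8;
    }
    return fold16(sum);
}

/* 32-bit one's complement sum (end-around carry on every add), then
 * add the top and bottom halves to get the 16-bit sum. */
uint16_t cksum_wide(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;
    uint32_t w;

    for (; n >= 2; n -= 2, p += 2) {
        w = ((uint32_t)p[0] << 16) | p[1];
        sum += w;
        if (sum < w)          /* the add wrapped: feed the carry back in */
            sum++;
    }
    if (n > 0) {              /* odd word left over */
        w = *p;
        sum += w;
        if (sum < w)
            sum++;
    }
    return fold16(sum);
}
```

For the four words 0x0001, 0xf203, 0xf4f5, 0xf6f7 both routines should give 0xddf2. A real TCP or IP checksum would still need the final one's complement of that value, and the byte order of the words is left to the caller.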