Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!brutus.cs.uiuc.edu!zweig
From: zweig@brutus.cs.uiuc.edu (Johnny Zweig)
Newsgroups: comp.protocols.tcp-ip
Subject: TCP Fletcher Checksum Option
Keywords: checksum Fletcher OSI
Message-ID: <1989Nov29.160929.10160@brutus.cs.uiuc.edu>
Date: 29 Nov 89 16:09:29 GMT
Sender: news@brutus.cs.uiuc.edu
Reply-To: zweig@cs.uiuc.edu
Distribution: comp
Organization: U of Illinois, CS Dept., Systems Research Group
Lines: 50

In re-reading the article ``Improving the Efficiency of the OSI Checksum
Computation'' in the latest Computer Communication Review (ACM SIGCOMM),
I remembered an idea that popped up over lunchtime several months ago.

It seems to me that I could add 4 or 5 lines of code to my TCP-checksum
computation routine (basically a 16-bit-at-a-time iteration over the
appropriate bytes) and have it compute both the regular TCP/IP checksum
and a 16-bit Fletcher checksum (by simply using an additional accumulator)
at a small additional cost (well, twice as many additions, so it's not really
such a small cost, I'm sure it will be said -- see below).
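
To make the idea concrete, here is a minimal sketch in C of the kind of loop
I have in mind. It assumes one particular reading of "16-bit Fletcher": the
second accumulator is a running ones-complement sum of the partial sums held
in the first, so the regular checksum field would carry (the complement of)
the first sum and the extra 2 bytes would carry the second. The names
dual_sum and dual_checksum are made up for illustration, the pseudo-header
is left out, and how the final values would be encoded in a segment is not
addressed.

```c
#include <stddef.h>
#include <stdint.h>

/* Both sums from one pass over the data (hypothetical type). */
struct dual_sum {
    uint16_t inet;     /* ones-complement sum of the 16-bit words, not inverted */
    uint16_t fletcher; /* ones-complement sum of the partial sums of 'inet' */
};

/* Ones-complement addition of two 16-bit values held in 32-bit variables. */
static uint32_t add1c(uint32_t x, uint32_t y)
{
    uint32_t s = x + y;
    return (s & 0xFFFF) + (s >> 16);   /* end-around carry */
}

/* Walk the buffer 16 bits at a time, updating both accumulators. */
struct dual_sum dual_checksum(const uint8_t *buf, size_t len)
{
    uint32_t a = 0;   /* the usual TCP/IP accumulator */
    uint32_t b = 0;   /* the one additional accumulator */
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        uint32_t w = ((uint32_t)buf[i] << 8) | buf[i + 1];
        a = add1c(a, w);   /* first addition: same as today */
        b = add1c(b, a);   /* second addition: the extra cost */
    }
    if (i < len) {         /* odd trailing octet, padded with a zero octet */
        a = add1c(a, (uint32_t)buf[i] << 8);
        b = add1c(b, a);
    }

    struct dual_sum r = { (uint16_t)a, (uint16_t)b };
    return r;
}
```

The carry is folded on every step here purely for clarity; a tuned version
would presumably defer the folding of the first sum, as existing checksum
routines do.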

I would like to hear opinions on whether a TCP header option-pair (one to
negotiate use of such a Fletcher checksum on connection-establishment, and
one to carry the actual extra 2 bytes in each segment) would be a
reasonable thing to propose.

Not to slam Postel and the whole IETF and everyone else who brought us the
TCP/IP checksum algorithm, but the fact that it cannot detect the
transposition of 16-bit-aligned blocks of N octets (N>=2, N even) in a datagram
is of concern in some applications where ultra-reliable communication is
essential.  Also, it has been pointed out by a number of hardware-types I
have talked to that these sorts of errors can occur, for example, when
DMAing bytes from an ethernet-controller to the main memory of a machine --
i.e. after the Ethernet CRC has already been checked. One such case is a
pair of 32-bit words going out over the bus in the wrong order under rare
timing glitches....
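
A small illustration of that failure mode, as a sketch only: the helper
names below are hypothetical, 16-bit words are read big-endian, and the
"Fletcher" sum is assumed to be a second accumulator that adds up the
partial sums of the first using ones-complement arithmetic. Swapping two
32-bit words leaves the plain ones-complement sum unchanged, while the
position-dependent second sum changes for this data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ones-complement addition of two 16-bit values held in 32-bit variables. */
static uint32_t add1c(uint32_t x, uint32_t y)
{
    uint32_t s = x + y;
    return (s & 0xFFFF) + (s >> 16);   /* end-around carry */
}

/* Read the i-th big-endian 16-bit word of the buffer. */
static uint32_t word16(const uint8_t *p, size_t i)
{
    return ((uint32_t)p[2 * i] << 8) | p[2 * i + 1];
}

/* Plain ones-complement sum over nwords 16-bit words (order-independent). */
uint16_t inet_sum(const uint8_t *p, size_t nwords)
{
    uint32_t a = 0;
    for (size_t i = 0; i < nwords; i++)
        a = add1c(a, word16(p, i));
    return (uint16_t)a;
}

/* Second, Fletcher-style accumulator: the sum of the partial sums. */
uint16_t fletcher_sum(const uint8_t *p, size_t nwords)
{
    uint32_t a = 0, b = 0;
    for (size_t i = 0; i < nwords; i++) {
        a = add1c(a, word16(p, i));
        b = add1c(b, a);
    }
    return (uint16_t)b;
}

/* Swap the i-th and j-th 32-bit words of the buffer, as a bus glitch might. */
void swap_words32(uint8_t *p, size_t i, size_t j)
{
    uint8_t tmp[4];
    if (i == j)
        return;
    memcpy(tmp, p + 4 * i, 4);
    memcpy(p + 4 * i, p + 4 * j, 4);
    memcpy(p + 4 * j, tmp, 4);
}
```

For the 8 octets 00 01 00 02 00 03 00 04, both orderings give a
ones-complement sum of 10, while the second sum is 20 before the swap and
28 after it. The second sum is not guaranteed to catch every transposition
(two identical words obviously swap undetected), but it does depend on word
order, which the plain sum does not.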

I know, I know, there is probably a stack of printout somewhere gathering
dust (being from 1967) giving good data about the fact that this Almost
Never Happens in Real Life. But an option that most implementations would
not support would not cost anything to anyone, other than to the
implementations that feel it is appropriate to use it. (I think this ought
to be an "optional" feature in the hypothetical revision of the Host
Requirements RFC that gets put out after the hypothetical RFC describing
this option is hypothetically released.)  The added security and assurance
of reliable communication seem to me to be (a) an easy option to add to the
protocol-suite, (b) demonstrably sufficient for ultra-reliable applications
(the 16-bit Fletcher algorithm is very good at detecting errors -- see
pp.35/36 of CCR v.19 no.5, etc.), and (c) free for anyone who does not use
them, since existing option-processing would reject any attempt to use the
option where it is unsupported. That is, vanilla TCP would not be changed at
all, and for a Fletcher-talking version to realize it can't talk Fletcher to
a site would only entail the transmission of a couple of packets; presumably
it would be very rare to try to talk Fletcher to someone who isn't known
in advance to support it.

So it's cheap, it's simple, it's reliable, it's easy to code. Why shouldn't
it be a TCP option?

-Johnny Reliable

