Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!usc!elroy.jpl.nasa.gov!decwrl!megatest!djones
From: djones@megatest.UUCP (Dave Jones)
Newsgroups: comp.lang.c
Subject: CRC-16 in C
Message-ID: <14381@rabbit.megatest.UUCP>
Date: 31 Oct 90 01:08:46 GMT
Organization: Megatest Corporation, San Jose, Ca
Lines: 44

Recently, in connection with the topic of hashing, Ron Irvine
posted a C algorithm for calculating the CRC-16 number for
a null-terminated string. I decided to grab it and put it
in my utilities library. His routine was based on two sixteen-entry
tables, because, he said, they would fit in cache. Well one
machine I use has a much larger cache than that, and the other
has none, so I decided just for the fun of it to convert the algorithm
to one based on one 256-entry table. I did that, and testing
has verified that it produces the same results.

Now then, for purposes of documentation, and to verify the original
algorithm, (and for my own education I guess), I decided to figure
out what the thing actually does. I borrowed a book, _Computer Networks,
Second Edition_, by Andrew S. Tanenbaum, (Simon & Schuster) and dug
in. I coded up the algorithm they gave, using the example on page
210, fig. 4-7.(*) My code printed out intermediate results exactly
matching those in the example. So far so good. Now I changed the
two constants that in theory should change the example check-sum into the
CRC-16 checksum. The result was that I got different numbers from
my algorithm than from the posted one. Could this be because of some
Big-endian/Little-Endian kind of disagreement? Does the posted algorithm
in effect shift the bits out least-significant bit first, for example?
Also the book specifies that the input is first tagged at the end with
16 zero-bits before doing the calculation. Does the algorithm published
on the net effectively do that? It does not appear to, but I confess I
have not yet figured out why the posted algorithm does anything similar
to what the book describes.

I guess what I am asking for is the "official" definition of
a CRC-16 checksum as it applies to byte-streams rather than bit-streams,
and why I don't get the same numbers that the posted algorithm gets.

In another posting, you'll find a shar of the files under discussion
here, including the "bit-shifting" toy algorithm based on what I
read in the book.


(*) Actually I had to change it a little, because the algorithm described
    was obviously wrong. It says, "A divisor is said to 'go into'
    a dividend if the dividend has as many bits as the divisor."
    I interpreted that to mean instead, "if the dividend and divisor have
    the same base-two log," which is consistent with the example,
    and with the statement that the algorithm can be implemented with
    "a simple shift register circuit".