Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!amdcad!rpw3 From: rpw3@amdcad.AMD.COM (Rob Warnock) Newsgroups: comp.protocols.misc Subject: Re: CRC vs data length Message-ID: <25209@amdcad.AMD.COM> Date: 13 Apr 89 02:06:09 GMT References: Reply-To: rpw3@amdcad.UUCP (Rob Warnock) Organization: [Consultant] San Mateo, CA Lines: 58 fdc@WATSUN.CC.COLUMBIA.EDU (Frank da Cruz) writes: +--------------- | In all the literature I've encountered on the Cyclic Redundancy Check, proofs | of its effectiveness have not discussed the length of the data being checked. | ... And this leads to the question, is there an optimum message length for | a given checking polynomial and error rate? Obviously, the longer the | message, the more efficient the use of the medium provided retransmissions can | be kept to a minimum... +--------------- Yes, CRC's are designed for an upper limit to message length. The upper limit for CRC-16 is -- not surprisingly -- 2^16 bits, or 8192 bytes. (But note that the CRC itself must be included in the 8192 bytes.) But multiple errors are *much* more likely to be detected if the block is much smaller than the maximum. And of course, the probability of multiple errors depends on the error rate and the block length. More interestingly, there is an optimum block length which depends strongly on the error rate, the retransmission scheme, and the link delay -- but *not* particularly on the CRC, assuming it's strong enough for the maximum block length. This is often shown in books on error-correcting codes as a family of throughput-vs-blocklength curves, and a typical curve is steadiliy rising at first, then peaking, and tailing off to a steep dive back down. Look in some of the older books for this, as the newer ones tend to concentrate on error-correction rather than ARQ. [Sorry I don't have any numbers for you.] Selective retransmit helps, as do full duplex channels, forward-error-correction codes, and other "advanced" features. Two practical notes: 1. CRC-32 (the "Ethernet" checksum) is very easy to do by table lookup, and should probably be used instead of CRC-16 for any new protocol. It requires a lookup table of 256 32-bit words. Typical C code for each byte is: crc = (crc << 8) ^ CRC32table[ (crc >> 24) ^ new_char]; Code has been posted to the net for this (several times!). 2. If the error rate is fairly high, and the link delay large, you may want to consider some limited error-correction. There is a simple Reed-Solomon code used in the IBM 3370 disk [described on pp.528-531 of "Error Control Coding" by Lin & Costello] which should be really easy to do in software by table lookup. You need two 256-byte tables, and two table lookups and three XORs per data byte. This code will correct an entire byte being zapped. (You should probably interleave it 3-4 ways.) If the "data" also includes an internal CRC-32, you'll have pretty good protection against mis-correction. I'm considering using this for a version of SLIP. [News at 11...] Rob Warnock Systems Architecture Consultant UUCP: {amdcad,fortune,sun}!redwood!rpw3 DDD: (415)572-2607 USPS: 627 26th Ave, San Mateo, CA 94403