Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!mit-eddie!ll-xn!ames!amdahl!oliveb!amiga!mitsumi!neil
From: neil@mitsumi.UUCP (Neil Katin)
Newsgroups: comp.unix.wizards,comp.arch
Subject: Re: Double-bit errors and ECC memory
Message-ID: <265@mitsumi.UUCP>
Date: Fri, 11-Sep-87 03:17:30 EDT
Article-I.D.: mitsumi.265
Posted: Fri Sep 11 03:17:30 1987
Date-Received: Sat, 12-Sep-87 16:32:30 EDT
References: <1184@itm.UUCP> <797@spar.SPAR.SLB.COM> <2891@phri.UUCP>
Reply-To: neil@mitsumi.UUCP (Neil Katin)
Organization: Mitsumi Technology Inc
Lines: 38
Xref: mnetor comp.unix.wizards:4197 comp.arch:2143

->In article <797@spar.SPAR.SLB.COM> hunt@spar.UUCP (Neil Hunt) writes:
->	The way most (all?) modern memory systems are built is to have each
->chip contribute a single bit to each of many words.  Thus, a typical 1
->Mbyte ECC board (small by today's standards) might consist of 39 256k
->chips, each chip contributing a single bit to each of the 256k 39-bit words
->(32 data plus 7 ECC bits) on the board.  If several bits in a given chip
->were to go bad, you would see errors in the same bit of several different
->words.  If an entire chip were to die, you would see an error in the same
->bit of *every* word on the board.  The memory controller would be able to
->correct any of these problems.
->
->	Note that the typical-but-mythical memory board described above
->has 7 check bits per 32 bit data word.  Since you need 2N+1 check bits to
->correct an N-bit error, this board should be able to detect and correct as
->many as 3 bad bits in any 32-bit word.  Thus, you could, if you wanted, go
->so far as to pluck out any 3 RAM chips on the board without losing any
->function (other than, maybe, access speed).
->--
->Roy Smith, {allegra,cmcl2,philabs}!phri!roy
->System Administrator, Public Health Research Institute
->455 First Avenue, New York, NY 10016

Sorry, I don't believe that is correct.
As I understand error correcting codes, it takes roughly log2(m) check
bits to protect an m-bit data word from a one-bit error.  More precisely,
r check bits must satisfy 2^r >= m + r + 1, since the check bits have to
name any one of the m + r bit positions or "no error".  That means
you need four bits to protect a byte, and six bits to protect a 32-bit
word.

I think (i.e. it's been a while since I did the math) that seven bits
on a 32-bit word buys you single-bit correction plus double-bit
detection, not correction of two- or three-bit errors.

The place where "2N+1" comes in is the "error distance" (the minimum
distance between valid code words) needed to map an erroneous data word
back to a correct one.  There is basically a tradeoff between pure
detection (distance N+1) and correction (distance 2N+1).  In other
words, you could either correct a two-bit error or detect a four-bit
error with the same number of code bits.

	Neil Katin
	{amiga,pyramid}!mitsumi!neil
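
For anyone who wants to check the arithmetic, here is a minimal Python
sketch.  It assumes the standard Hamming bound for single-error
correction (2^r >= m + r + 1) and the usual trick of adding one overall
parity bit for double-error detection.  The function names are my own,
made up for illustration; this is not how any particular memory
controller does it.

```python
def sec_check_bits(m: int) -> int:
    """Smallest r with 2**r >= m + r + 1 (single-error correction).

    The r check bits must distinguish m + r possible bad bit positions
    plus the "no error" case.
    """
    r = 0
    while 2 ** r < m + r + 1:
        r += 1
    return r


def secded_check_bits(m: int) -> int:
    """Assumes one extra overall parity bit adds double-error detection."""
    return sec_check_bits(m) + 1


def correctable(distance: int) -> int:
    """Errors correctable at minimum distance d: needs d >= 2N + 1."""
    return (distance - 1) // 2


def detectable(distance: int) -> int:
    """Errors detectable (pure detection) at distance d: needs d >= N + 1."""
    return distance - 1


if __name__ == "__main__":
    for m in (8, 32, 64):
        print(m, "data bits:", sec_check_bits(m), "check bits for SEC,",
              secded_check_bits(m), "for SEC-DED")
    # The same distance-5 code can correct 2 errors or detect 4.
    print("distance 5:", correctable(5), "correctable,",
          detectable(5), "detectable")
```

For a 32-bit word this gives 6 check bits for single-error correction
and 7 with the extra parity bit, which matches the 39-bit words on the
board described in the quoted article.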