Path: utzoo!attcan!uunet!van-bc!
From: lphillips@lpami.wimsey.bc.ca (Larry Phillips)
Newsgroups: comp.sys.amiga.tech
Subject: Re: Parity Checking / ECC RAM on the A3000
Message-ID: <1668@lpami.wimsey.bc.ca>
Date: 1 Jun 90 01:00:51 GMT
Lines: 74
Return-Path:
To: van-bc!rnews

In , dillon@overload.UUCP (Matthew Dillon) writes:
>>Parity schemes, on the other hand, cannot detect the failure of a parity bit
>>itself, and thus reduces the overall reliability as a tradeoff for knowing when
>
> A parity scheme will detect all one bit errors, even if the bit that
> error'd is the parity bit itself. The parity scheme does not know *which*
> bit err'd, or whether it was the parity bit itself that err'd, but it
> will detect any single bit error.
>
> A reasonable ECC scheme (7 bits to correct 32 bits as I mentioned in my
> previous posting) will detect and correct all 1 bit errors where that 1
> bit is any one of the 32 bits. It will detect any single bit error in
> the ECC code itself in which case the real data is assumed to be valid
> and no other action is taken. I believe the scheme will also detect any
> two bit errors (through all 39 bits).
>
> One should never think of an ECC scheme in terms of whether the erroneous
> bits are in the ECC part or the real-data part. Or, at least, I never
> think of it that way. You tend to produce weak algorithms when you
> consider cases that depend on the meaning of bits rather than work on
> a general algorithm that can do a better job all around.

Right. One does not need to think about which bit has failed in an ECC
memory access, because the data is always assumed to be correct when it
arrives at the destination. With ECC, all bits must be assumed to be
equally important, since you are depending on all bits in order to make
the above assumption.

My comment had to do with parity schemes, where there is no choice. Since
you cannot know which bit failed, you cannot assume the data is intact at
the destination.
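To make the parity point concrete, here is a minimal sketch, assuming
byte-wide even parity (8 data bits plus 1 parity bit). The function names
and the 9-bit word layout are invented for illustration and do not
describe any particular memory board.

```python
def even_parity_bit(data):
    """Return the bit that makes the count of 1s in data + parity even."""
    return bin(data & 0xFF).count("1") & 1


def store(data):
    """Pack 8 data bits and their parity bit into a 9-bit word.

    Assumed layout: data in bits 0-7, parity in bit 8.
    """
    data &= 0xFF
    return data | (even_parity_bit(data) << 8)


def parity_ok(word):
    """True if the 9-bit word still holds an even number of 1 bits."""
    return bin(word & 0x1FF).count("1") % 2 == 0


def single_bit_flips_detected(data):
    """Flip each of the 9 stored bits in turn; report whether each is caught."""
    word = store(data)
    return [not parity_ok(word ^ (1 << bit)) for bit in range(9)]
```

Every one of the nine single-bit flips fails the check, including the
flip of bit 8, the parity bit itself. The check gives only a pass/fail
answer, though, so a flipped parity bit looks exactly like a flipped data
bit, and flipping any two bits passes the check unnoticed.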
From this, you can unequivocally state that the addition of 1/8 more bits
to the memory has done one thing for you, and one thing to you. The thing
it has done for you is to tell you that _some_ bit did not read out
correctly. The thing it has done _to_ you is to increase the chances of a
bit being read out wrong. The point is not whether you should know if a
parity bit fails, but that if it did happen to be a parity bit that
failed, it was a 'useless' error; one that would not have happened if you
did not have parity checking. Again, on average, 1/9 of the errors will
fall into this category, though you will not know which ones they are.

>>you had an error, even if that error is meaningless and would not have happened
>>without the parity bit being present. Statistically speaking, if parity is
>
> Thinking of things that way will wind you into a corner fast!

Not at all. It will only get you into trouble if you try to divine which
bit failed, and act upon it. We don't get ourselves in trouble just
because we possess the knowledge that approximately 1/9 of all errors are
spurious. :-)

>>have a lot of choice. With an ECC scheme, the system can make note of the error
>>and keep using the memory, allowing it to map the page out when the number of
>>errors exceeds a threshold over a predefined period of time. It will also
>>allow reporting of single bit errors to the operator, who can make a good
>>judgement as to the root cause, and take action as appropriate.
>
> This is one good use of ECC... to detect failing memory.

Some of the memories I have worked on have had literally thousands of
memory chips (imagine 96 megs worth of 4KBit chips), and the ECC was
invaluable for detecting a degenerating chip. We kept accurate records of
all single bit errors, provided by the memory itself and stored for
readout at PM time, and it was quite easy to discard the 'random event'
failures, and to catch anything that was on its way to being a solid
error.
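The record keeping described above can be sketched roughly as follows.
This is only an illustration, not the system those memories used: the
class name, the chip labels, and the threshold and window values are all
hypothetical, and timestamps are assumed to arrive in non-decreasing
order.

```python
from collections import defaultdict, deque


class SingleBitErrorLog:
    """Keep recent corrected-error timestamps for each chip (hypothetical)."""

    def __init__(self, threshold=3, window_seconds=86400.0):
        self.threshold = threshold        # errors in the window that flag a chip
        self.window = window_seconds      # how far back an error still counts
        self.events = defaultdict(deque)  # chip label -> recent timestamps

    def _prune(self, chip, now):
        """Drop errors on this chip that are older than the window."""
        q = self.events[chip]
        while q and now - q[0] > self.window:
            q.popleft()
        return q

    def record(self, chip, timestamp):
        """Log one corrected error; True if the chip now looks degenerating."""
        self.events[chip].append(timestamp)
        return len(self._prune(chip, timestamp)) >= self.threshold

    def suspects(self, now):
        """Chips at or over the threshold within the window ending at 'now'."""
        return sorted(
            chip for chip in list(self.events)
            if len(self._prune(chip, now)) >= self.threshold
        )
```

An isolated error ages out of the window and is never reported, which
corresponds to discarding the 'random event' failures. A chip that keeps
producing corrected errors crosses the threshold and shows up in
`suspects()` while its errors are still correctable.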
Replacing the chip before it got to be a solid problem meant a saving of
many hours trying to track down double-bit errors, which were MUCH harder
to isolate.

-larry

--
The raytracer of justice recurses slowly, but it renders exceedingly fine.
+-----------------------------------------------------------------------+
|   //   Larry Phillips                                                 |
|  \X/   lphillips@lpami.wimsey.bc.ca -or- uunet!van-bc!lpami!lphillips |
|        COMPUSERVE: 76703,4322 -or- 76703.4322@compuserve.com          |
+-----------------------------------------------------------------------+