Path: utzoo!attcan!uunet!snorkelwacker!think!linus!eachus
From: eachus@linus.mitre.org (Robert I. Eachus)
Newsgroups: comp.sys.amiga.tech
Subject: Re: Parity Checking / ECC RAM on the A3000
Message-ID:
Date: 8 Jun 90 20:43:16 GMT
References: <1990May27.101258.24470@zorch.SF-Bay.ORG> <1410047@hpcvca.CV.HP.COM>
Sender: usenet@linus.mitre.org
Organization: The Mitre Corporation, Bedford, MA
Lines: 58
In-reply-to: charles@hpcvca.CV.HP.COM's message of 7 Jun 90 20:21:16 GMT

In article <1410047@hpcvca.CV.HP.COM> charles@hpcvca.CV.HP.COM (Charles Brown) writes:

> If the ECC RAM returns the correct value but too late, it is not
> designed correctly. Part of the task of design is to make sure there
> is enough margin. So you have not demonstrated your point. What you
> have demonstrated is that: Poorly designed ECC RAM is sometimes less
> reliable than well designed RAM w/o ECC. So what.

   Way off... With modern memory parts everything is quantized and
statistical, since the charges being moved are on the order of 20,000
times the charge of an electron. In theory (but very, very rarely)
you will sometimes get no electrons willing to move, and read a one as
a zero. Much more likely is that the charge in a particular cell is
represented by significantly fewer than the average number of
electrons. This will result in a slower rise time when the cell is
read, so you must allow some slack in your design for this "jitter" in
the rise time. How much? 6 Sigma? 10 Sigma? Whatever number you
choose, there is some statistical chance that an error will occur
because the values were latched too early.

   EDAC is usually done without latching the values, since latching
will usually add a clock cycle to the memory delay. If there is a
complete error in a single bit in EDAC memory, no problem. However, a
"slow" bit can result in the output of an EDAC PAL being wrong at
precisely the wrong time, even though it was correct a nanosecond
earlier, and will be correct a nanosecond later.
(The logic paths through a PAL are often different lengths, and other
effects can also add jitter to the signal.) This was the effect I was
referring to when I said that these late bit errors more often occur
when a bit is being corrected.

   Since the EDAC circuitry adds timing uncertainty to a memory
system, and also slows it down, it is much more difficult to allow a
10 Sigma margin on EDAC circuitry. (Six sigma gives you one error per
billion, or about one every 3 minutes on a memory read at 5 MegaHertz.
At nine sigma, a fifty per cent increase in MARGIN which might mean an
added 3 ns. of delay without EDAC or an added 10 ns. with EDAC, an
error will occur every 20,000 years.)

   To repeat myself, a good EDAC memory today is unlikely to be
significantly better than a well designed memory with no checking.

   But now to change the subject: If you have all accounting data
entered and then verified by a separate operator, you can get the
error rate down to 1 in 10,000 for keyboard input. Increasing the
probability of error by one-millionth by using a machine without
parity checking is therefore in the noise, especially since, when
accounting is done on PCs, the operator usually checks his or her own
input.

   So when Bob Silverman down the hall wants to factor a 100+ digit
number using several decades of machine time, he cares about
self-checking and memory error rates. Accountants don't.

--
					Robert I. Eachus

with STANDARD_DISCLAIMER;
use  STANDARD_DISCLAIMER;
function MESSAGE (TEXT: in CLEVER_IDEAS) return BETTER_IDEAS is...
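
A rough check of the sigma figures quoted in the post, as a minimal sketch. It assumes the rise-time jitter is Gaussian and that only the late (one-sided) tail causes a bad latch; neither assumption is stated in the post. The function names are illustrative, and the default of 5 million reads per second is taken from the 5 MegaHertz figure.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def tail_probability(sigma: float) -> float:
    """One-sided Gaussian tail: chance that a read lands more than
    `sigma` standard deviations late (assumed jitter model)."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


def seconds_between_errors(sigma: float, reads_per_second: float = 5e6) -> float:
    """Mean time between late-bit errors for a given timing margin."""
    return 1.0 / (tail_probability(sigma) * reads_per_second)


# Compare the two margins discussed in the post.
for sigma in (6.0, 9.0):
    interval = seconds_between_errors(sigma)
    print(f"{sigma:.0f} sigma: p = {tail_probability(sigma):.3e} per read, "
          f"one error every {interval:.3e} s "
          f"({interval / SECONDS_PER_YEAR:.3e} years)")
```

Under these assumptions six sigma works out to roughly one error per billion reads, or one every few minutes at 5 MHz, and nine sigma to an interval of tens of thousands of years, the same order of magnitude as the figures in the post.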