Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!uunet!mcsun!ukc!cam-cl!tdw
From: tdw@cl.cam.ac.uk (Tim Wilson)
Newsgroups: comp.periphs
Subject: Disc reliability and ECC
Summary: How much can ECCs improve HDA reliability?
Keywords: disc disk reliability MTBF ECC error
Message-ID: <1658@gannet.cl.cam.ac.uk>
Date: 16 Nov 89 20:36:31 GMT
Sender: news@cl.cam.ac.uk
Reply-To: tdw@cl.cam.ac.uk (Tim Wilson)
Organization: U of Cambridge Comp Lab, UK
Lines: 42

I am looking for information about calculated or measured hard error
rates from hard discs. I am conducting some experiments on the
reliability of non-volatile (i.e. battery-backed) primary memory, and
would like to compare my results with the more usual form of
non-volatile memory, the disc.

When I pick up a disc manual, I read that the hard error rate is (say)
less than 10 bits in 10^14. I then read that the ECC can correct up to
(say) one burst of up to ten corrupted bits. What I have never seen
stated is the final hard error rate *after* application of ECC---the
unrecoverable error rate as seen by the operating system.

== More detail on the experiments, in case you'd like to know ==

Non-volatile primary memory (NVM) has some appealing possibilities as
a component of file systems. It could be used to eliminate the danger
of inconsistency resulting from write-behind caches; it could be used
for transaction commit records, eliminating the disc latency.

NVM has been proposed as a component of file systems on several
occasions, but as far as I know, no one has tried it because of the
danger that when the system crashes, it may corrupt the NVM. Operating
systems are assumed not to exhibit fail-stop characteristics.

So I am repeatedly and deliberately crashing a toy op sys, and then
checking to see whether it has corrupted its NVM (preliminary results
suggest that it does happen, occasionally).

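One way such a post-crash check might look is sketched below. This is
only an illustration, not the code used in the experiments: it assumes
the NVM contents can be read out as a byte string, records a CRC-32
per block before the crash, and reports which blocks differ afterwards.
The block size and the names block_checksums and corrupted_blocks are
hypothetical.

```python
import zlib
from typing import List

BLOCK_SIZE = 512  # hypothetical block size in bytes (an assumption)


def block_checksums(image: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return one CRC-32 per block of the NVM image."""
    return [
        zlib.crc32(image[off:off + block_size])
        for off in range(0, len(image), block_size)
    ]


def corrupted_blocks(before: List[int], image_after: bytes,
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indices of blocks whose checksum changed across the crash."""
    after = block_checksums(image_after, block_size)
    if len(after) != len(before):
        raise ValueError("NVM image size changed between snapshots")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


if __name__ == "__main__":
    # Simulated run: a 2048-byte image with one byte flipped "by the crash".
    image = bytes(2048)
    saved = block_checksums(image)
    damaged = bytearray(image)
    damaged[700] ^= 0xFF  # byte 700 lies in block 1
    print(corrupted_blocks(saved, bytes(damaged)))
```

The saved checksums would have to live somewhere the crashing system
cannot overwrite, otherwise the check itself is suspect.
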
I also have some ideas on how to use the MMU to prevent the op sys
from writing the NVM by accident (the principle of NVM protection is
not new, either).

How reliable should I make the NVM subsystem? An obvious target is to
better the disc which the NVM is buffering---hence my interest in the
hard read error rate.

Thanks in advance for any info,

Tim
---
Tim Wilson | tdw@uk.ac.cam.cl                    | U of Cambridge Computer Lab,
           | ...!uunet!mcvax!ukc!cam-cl!tdw      | New Museums Site, Pembroke
Research   | tdw%cl.cam.ac.uk@nsfnet-relay.ac.uk | St, CAMBRIDGE, UK, CB2 3QG
assistant  | +44 223 334624; Fax +44 223 334679  | (But not speaking for them)