Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: comp.arch
Subject: Re: Double-bit errors and ECC memory
Message-ID: <8724@utzoo.UUCP>
Date: Wed, 7-Oct-87 14:01:13 EDT
Article-I.D.: utzoo.8724
Posted: Wed Oct 7 14:01:13 1987
Date-Received: Wed, 7-Oct-87 14:01:13 EDT
References: <686@obiwan.UUCP>, <8637@utzoo.UUCP> <8638@utzoo.UUCP>, <870@alaska.cray.com>
Organization: U of Toronto Zoology
Lines: 18

> ... However, if you mean actually decoding the syndrome
> bits to determine which bit has been flipped, this seems impractical. What
> happens if the error is in the code that corrects errors?

What happens if the code that runs your software-managed TLB gets a TLB miss?
What happens if your pager gets a page fault? The answer is the same: you
have to make sure it doesn't. Either the software has to be very careful
(which is okay for things like paging but not for hardware issues like
error correction), or else the crucial bits of software have to get special
help. Include a small amount of high-reliability static RAM to hold the
memory-error handler. That is what Cheriton et al did for the cache handler
in their virtual-cache-MMUless design: the hardware has no idea how to do
the virtual->real mapping for a cache miss, so the software that does the
mapping MUST NOT cache miss, so it sits in a special bit of supervisor-only
memory that is neither mapped nor cached.
-- 
PS/2: Yesterday's hardware today.    |  Henry Spencer @ U of Toronto Zoology
OS/2: Yesterday's software tomorrow. | {allegra,ihnp4,decvax,utai}!utzoo!henry
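
For anyone wondering what "decoding the syndrome bits to determine which
bit has been flipped" would look like in software, here is a toy sketch in C.
It is an illustration only, not anything from the post or from a real memory
controller: it assumes a 4-bit data word protected by a Hamming code plus one
overall parity bit (single-error-correct, double-error-detect), whereas real
memory words are much wider. The names secded_encode, secded_decode and the
ecc_status values are hypothetical. A handler of this sort is the code that
would have to sit in the protected static RAM the post describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum ecc_status { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

/* XOR of all eight bits: 0 if an even number of bits are set. */
static unsigned parity8(uint8_t v)
{
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Syndrome: XOR of the positions (1..7) of every set bit.
 * It is zero for a valid codeword, and equals the position of the
 * flipped bit when exactly one of bits 1..7 is wrong. */
static unsigned syndrome(uint8_t w)
{
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 7; pos++)
        if (w & (1u << pos))
            s ^= pos;
    return s;
}

/* Layout assumed here: data in bits 3, 5, 6, 7; Hamming check bits in
 * bits 1, 2, 4; overall parity in bit 0. */
uint8_t secded_encode(uint8_t data)
{
    assert(data < 16u);
    uint8_t w = 0;
    if (data & 1u) w |= 1u << 3;
    if (data & 2u) w |= 1u << 5;
    if (data & 4u) w |= 1u << 6;
    if (data & 8u) w |= 1u << 7;

    /* Set check bits so that the syndrome of the finished word is zero. */
    unsigned s = syndrome(w);
    if (s & 1u) w |= 1u << 1;
    if (s & 2u) w |= 1u << 2;
    if (s & 4u) w |= 1u << 4;

    /* Bit 0 makes the overall parity even. */
    w |= (uint8_t)parity8(w);
    return w;
}

enum ecc_status secded_decode(uint8_t w, uint8_t *data_out)
{
    unsigned s = syndrome(w);
    unsigned odd = parity8(w);
    enum ecc_status st = ECC_OK;

    /* Nonzero syndrome with even overall parity: two bits flipped. */
    if (s != 0 && !odd)
        return ECC_UNCORRECTABLE;

    /* Odd overall parity: one bit flipped, and the syndrome names it
     * (syndrome 0 means the overall parity bit itself). */
    if (odd) {
        w ^= (uint8_t)(1u << s);
        st = ECC_CORRECTED;
    }

    *data_out = (uint8_t)(((w >> 3) & 1u) | (((w >> 5) & 1u) << 1) |
                          (((w >> 6) & 1u) << 2) | (((w >> 7) & 1u) << 3));
    return st;
}
```

The decode path is a handful of shifts and XORs with no memory references
beyond its own code and the word being checked, which is what makes it small
enough to park in a protected region. Note that three or more flipped bits
can fool this scheme, as with any SECDED code.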