Path: utzoo!utgpu!news-server.csri.toronto.edu!clyde.concordia.ca!uunet!cs.utexas.edu!swrinde!zaphod.mps.ohio-state.edu!brutus.cs.uiuc.edu!ux1.cso.uiuc.edu!ux1.cso.uiuc.edu!aglew
From: aglew@dwarfs.csg.uiuc.edu (Andy Glew)
Newsgroups: comp.arch
Subject: ECC
Message-ID:
Date: 7 May 90 16:25:51 GMT
References: <402@dg.dg.com> <406@dg.dg.com> <38303@mips.mips.COM> <10213@batcomputer.tn.cornell.edu> <38550@mips.mips.COM>
Sender: usenet@ux1.cso.uiuc.edu (News)
Distribution: na
Organization: University of Illinois, Computer Systems Group
Lines: 34
In-Reply-To: mash@mips.COM's message of 7 May 90 03:39:02 GMT

ECC on memory has several performance costs:

    (1) the time to compute the ECC on a read - can it be overlapped
        with the processor, or can you handle a "the memory location
        you read several cycles ago, that you have written to all of
        the registers, was bad" trap with a significant delay?
    (2) the hardware to do the ECC computation - does it add load
        delays, or could it be used for something else?
    (3) partial writes become (non-atomic) read-modify-write
        operations...

Note that (3) isn't necessary if you already have all of the old cache line data in your cache - well, you do an RMW in cache, and recompute the ECCs, but at least you don't have to do the RMW across the bus. This is equivalent to a write-allocate policy, which is implicit in many of the snoopy policies that obtain a cache line exclusively on the first write. But sometimes first-write exclusivity isn't desirable, and/or you want to be able to do partial writes.

Q: does anyone know of "sometimes" ECC systems? I.e., memory systems where there is a valid bit associated with the ECC, so that a partial write where the ECC cannot be totally computed just clears the valid bit (perhaps downgrading to some simpler form of parity check, which may be embedded within the ECC). The next time the whole cache line is written or read, a new full-line-width ECC could be computed.
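
A minimal sketch of the valid-bit idea in the question above, in Python. Everything in it is an assumption made for illustration: the 8-byte line, the names Line, write_partial, and scrub, per-byte parity as the "simpler check", and a plain single-error-correcting Hamming code standing in for whatever ECC a real memory system would use. It does not describe any existing machine. The scrub function models the idle-time pass over memory that the post goes on to suggest.

```python
LINE_BYTES = 8  # hypothetical line width: 8 bytes = 64 data bits

# Hamming positions 1..71 that are not powers of two carry the 64 data bits;
# the seven power-of-two positions belong to the check bits.
DATA_POS = [p for p in range(1, 72) if p & (p - 1)]
POS_TO_BIT = {pos: bit for bit, pos in enumerate(DATA_POS)}


def hamming_check(data):
    """Check bits of a SEC Hamming code: XOR of the positions of set data bits."""
    word = int.from_bytes(data, "little")
    check = 0
    for bit, pos in enumerate(DATA_POS):
        if (word >> bit) & 1:
            check ^= pos
    return check


def parity(byte):
    return bin(byte).count("1") & 1


class Line:
    """One memory line with full-line ECC, per-byte parity, and an ECC valid bit."""

    def __init__(self, data):
        assert len(data) == LINE_BYTES
        self.data = bytearray(data)
        self._recompute()

    def _recompute(self):
        self.byte_parity = [parity(b) for b in self.data]
        self.ecc = hamming_check(self.data)
        self.ecc_valid = True

    def write_full(self, data):
        assert len(data) == LINE_BYTES
        self.data[:] = data
        self._recompute()

    def write_partial(self, offset, value):
        # No read-modify-write of the line: update one byte and its parity,
        # and mark the full-line ECC as stale.
        self.data[offset] = value
        self.byte_parity[offset] = parity(value)
        self.ecc_valid = False

    def read(self):
        if self.ecc_valid:
            syndrome = self.ecc ^ hamming_check(self.data)
            if syndrome:
                if syndrome in POS_TO_BIT:
                    # Single flipped data bit: the syndrome is its position.
                    bit = POS_TO_BIT[syndrome]
                    self.data[bit // 8] ^= 1 << (bit % 8)
                elif syndrome & (syndrome - 1):
                    raise MemoryError("uncorrectable error")
                # Otherwise a stored check bit flipped; rebuild the ECC.
                self.ecc = hamming_check(self.data)
        else:
            # Downgraded mode: parity can detect an error but not correct it.
            if any(parity(b) != p for b, p in zip(self.data, self.byte_parity)):
                raise MemoryError("parity error while ECC invalid")
            # The whole line has been read, so the full ECC can be rebuilt.
            self.ecc = hamming_check(self.data)
            self.ecc_valid = True
        return bytes(self.data)


def scrub(lines):
    """Idle-time pass: revalidate the ECC on every line that lost it."""
    count = 0
    for line in lines:
        if not line.ecc_valid:
            line.read()
            count += 1
    return count
```

In this model a line with a cleared valid bit is protected only against detectable (odd-weight, per-byte) errors until the next full read, full write, or scrub, which is the exposure window the trade-off discussion is about.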
A "scrubber" process in the OS could go through all of memory during idle time, recomputing the full ECC and setting it back to valid.

The obvious counterargument is that if you can afford not to have ECC at all times, then you can afford not to have ECC at all. Or, if the frequency of partial writes is low enough that the (un)reliability of not having ECC on those memory locations is acceptable, then the performance cost of doing RMWs at those times is probably acceptable as well.

I'm not proposing or evaluating, just wondering out loud if it has already been done somewhere.

--
Andy Glew, aglew@uiuc.edu