Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!dali.cs.montana.edu!milton!uw-beaver!ubc-cs!alberta!myrias!dab
From: dab@myrias.com (Danny Boulet)
Newsgroups: comp.arch
Subject: Re: Workstation Data Integrity
Message-ID:
Date: 29 Aug 90 20:21:56 GMT
References: <1990Aug3.204358.330@portia.Stanford.EDU> <40694@mips.mips.COM> <2399@crdos1.crd.ge.COM> <1990Aug10.171744.9639@zoo.toronto.edu> <2421@crdos1.crd.ge.COM> <1990Aug18.210132.25203@sco.COM> <2434@crdos1.crd.ge.COM> <6797.26d6edce@vax1.tcd.ie> <2469@crdos1.cr
Organization: Myrias Research Corporation
Lines: 42

In article <3294@awdprime.UUCP> tif@doorstop.austin.ibm.com (Paul Chamberlain/32767) writes:
>In article <2469@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>>No answer is better than a wrong answer. What would anyone
>>bother to run on a computer which is so valueless that they don't care
>>if they get a right answer, just so that you get an answer?
>
>I'm sorry, but I have to go into reality mode here. I can understand
>if you were running a simulation on the space shuttle you'd rather
>get no answer than a wrong answer. But let's say you were doing something
>more typical, like ... oh ... replying to a long article in news. You've
>been typing and researching for an hour now. I ask you this: would you
>rather I just blow away that entire article and crash your machine or change
>a single random character?

Gee. That depends.

Consider the characters "1.23456e12". If the random change hits the '6'
and turns the characters into "1.23452e12" then I probably don't mind.
If the random change hits the exponent field and I get the characters
"1.23456e92" (a one bit change) then I probably mind quite a bit.
Similar effects occur with binary data (hitting the low order bit of a
floating point number won't matter much but watch what happens when a
high order bit or sign or exponent bit gets hit).
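
To make the arithmetic above concrete, here is a rough sketch of both
cases. The helper names (flip_bit_in_text, flip_bit_in_double) are made
up for illustration, and the binary case assumes a 64-bit IEEE 754
double, which is an assumption about the machine and not something every
system uses.

```python
import struct

def flip_bit_in_text(text, index, bit):
    """Flip one bit of the ASCII character at position `index` (hypothetical helper)."""
    data = bytearray(text.encode("ascii"))
    data[index] ^= 1 << bit
    return data.decode("ascii")

def flip_bit_in_double(value, bit):
    """Flip one bit of a double, assuming IEEE 754 layout (bit 0 = lowest mantissa bit, bit 63 = sign)."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (result,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return result

s = "1.23456e12"
print(flip_bit_in_text(s, 6, 2))  # '6' (0x36) becomes '2' (0x32): 1.23452e12
print(flip_bit_in_text(s, 8, 3))  # '1' (0x31) becomes '9' (0x39): 1.23456e92

x = 1.23456e12
print(flip_bit_in_double(x, 0))   # low order mantissa bit: barely changes
print(flip_bit_in_double(x, 62))  # high order exponent bit: wildly different magnitude
print(flip_bit_in_double(x, 63))  # sign bit: same magnitude, opposite sign
```

Both text changes are single-bit flips of one byte, yet the second one
changes the value by eighty orders of magnitude.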

I'd prefer that the question were: "would you prefer that I crash the
machine and force you to use the backup file produced by your editor,
or silently produce a wrong answer?"

I'm very much in favour of answers that I trust. I know that there are
an awful lot of ways that a computer can produce wrong answers. That is
no excuse for failing to catch the ones that it is practical to catch.
Adding an extra bit to each byte (or whatever) seems like a small price
to pay for a bit more confidence in the results. Also, given the
reliability of current memory and such, crashes due to parity errors
would probably be a lot less frequent than crashes due to other random
events (i.e. adding this feature probably wouldn't do much harm to the
MTBF numbers for the system).

One final note: a lot of small computers are used for business
applications like payroll, accounting, inventory and such. This may not
be as flashy as simulating the space shuttle, but silent failures in
these applications can be pretty devastating to the business.
Unfortunately, the users of such systems are probably the least likely
to appreciate the value of knowing that the computer detected an error
and aborted rather than giving wrong answers.
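
For anyone who wants to see what the "extra bit to each byte" buys, here
is a minimal sketch of even parity in software. Real parity memory does
this in hardware; the names below (ParityError, store, load) are
invented for the example and do not correspond to any particular system.

```python
class ParityError(Exception):
    """Raised when a stored byte no longer matches its parity bit (hypothetical name)."""

def parity_bit(byte):
    """Even-parity bit: 1 if the byte has an odd number of set bits, else 0."""
    return bin(byte & 0xFF).count("1") % 2

def store(byte):
    """Model a 9-bit memory cell as (data byte, parity bit)."""
    return (byte & 0xFF, parity_bit(byte))

def load(cell):
    """Return the data byte, or abort if it disagrees with its parity bit."""
    byte, parity = cell
    if parity_bit(byte) != parity:
        raise ParityError("parity mismatch on byte 0x%02x" % byte)
    return byte

cell = store(ord("1"))                # '1' is 0x31
corrupted = (cell[0] ^ 0x08, cell[1]) # single-bit hit turns it into '9' (0x39)
try:
    load(corrupted)
except ParityError as err:
    print("aborted:", err)            # detected, so no silent wrong answer
```

Note the limit: a single parity bit catches any odd number of flipped
bits in the cell but misses an even number, so it is a cheap check and
not a guarantee.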