Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uwm.edu!spool.mu.edu!mips!zaphod.mps.ohio-state.edu!think.com!snorkelwacker.mit.edu!thunder.mcrcim.mcgill.edu!mouse
From: mouse@thunder.mcrcim.mcgill.edu (der Mouse)
Newsgroups: comp.unix.wizards
Subject: Re: Another reason I hate NFS: Silent data loss!
Message-ID: <1991Jun22.133334.26320@thunder.mcrcim.mcgill.edu>
Date: 22 Jun 91 13:33:34 GMT
References: <27226@adm.brl.mil> <16703.Jun1903.07.1091@kramden.acf.nyu.edu>
Organization: McGill Research Centre for Intelligent Machines
Lines: 36

In article , truesdel@nas.nasa.gov (David A. Truesdell) writes:
> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>> In article <27226@adm.brl.mil> mike@BRL.MIL ( Mike Muuss) writes:
>>> NFS is designed as a reliable protocol.  I have pounded more than
>>> 250 NFS requests/sec against a fileserver, and no data loss.
>>> Things you should check are the number of retransmit's you
>>> authorized in /etc/fstab, [...]
>> If the number of retransmits runs out, the writing process
>> ``should'' get an error.  Otherwise the implementation is
>> (obviously) buggy.
> Why ``should'' it?  Your writes probably put their data into the
> buffer cache just fine, it's the subsequent flushing of the buffer
> cache that failed.  And guess what?  The write had probably already
> returned by then.

Consider a real disk.  What happens if a real disk doesn't respond when
the kernel writes a buffer from the buffer cache to it?

Right.  The kernel panics.

So a case could be made that if the number of retransmits runs out
(where a hard mount could be considered as specifying infinite
retransmission), the kernel should panic.

Unfortunately, fileservers die much more often than disks do.  The
current behavior is a compromise between preserving disk semantics and
practicality.

(No, I don't particularly like NFS either.  For us, unfortunately, it
is pretty much the only game in town.)

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu