Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!exodus!appserv!slovax.Eng.Sun.COM!lm
From: lm@slovax.Eng.Sun.COM (Larry McVoy)
Newsgroups: comp.protocols.nfs
Subject: Re: NFS performance
Message-ID: <625@appserv.Eng.Sun.COM>
Date: 13 Jun 91 20:47:44 GMT
References: <1991Jun13.164017.29944@Firewall.Nielsen.Com>
Sender: news@appserv.Eng.Sun.COM
Organization: Sun Microsystems, Mt. View, CA.
Lines: 39

kdenning@genesis.Naitc.Com (Karl Denninger) writes:
> >The reason for this has to do with NFS' stateless nature - it can't ACK
> >the write until the data is safe; otherwise the server could crash and the
> >client would lose data.
>
> The interesting thing is, there is little or no disk activity going on (from
> a look at the wait I/O times and queues)..... on a Sun, on the other hand,
> there IS a lot of disk activity during an NFS write operation.
>
> The MIPS systems I've used don't suffer from this problem.
>
> I don't quite understand the fanaticism with which people preach the NFS
> stateless nature, O_SYNC and all that.  The fact is that a crash of a
> LOCAL Unix machine with the normal block buffering scheme can easily cause
> the loss of data -- in this case, the write(2) call returned "ok" but it
> really might not be "OK"!  This is true whether the problem is later found
> to be a bad disk sector, the machine panicking, or any one of a number of
> other causes.  Normal disk I/O on Unix machines is NOT reliable enough to
> say "if you get a good return from write(), the data is safely on disk".

NFS is stateless.  The reason for this statelessness is so that a client
does not need to do anything special when a server goes down.  A dead
server looks just like a slow server to a client.

A client issues a write, and the server ACKs the write.  What does that ACK
mean?  It means that the client data is safe.  The client kernel may throw
away the data; the server has promised that the data can be retrieved.
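
As a rough sketch of that contract (Python purely for illustration; handle_write and its "stable" flag are made-up names, not part of any real NFS server), the safe path forces the data to disk before replying, while the unsafe path replies as soon as the data is in the buffer cache:

```python
import os
import tempfile


def handle_write(path, offset, data, stable=True):
    """Hypothetical server-side handler for one client write request.

    With stable=True the data is forced to disk before the ACK is
    returned, so the client may discard its copy.  With stable=False
    the ACK goes out while the data may still sit in the buffer cache,
    which leaves a window where a server crash loses it.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than asked, so loop.
            written += os.write(fd, data[written:])
        if stable:
            # Block until the kernel reports the data is on stable storage.
            os.fsync(fd)
    finally:
        os.close(fd)
    return "ACK"


if __name__ == "__main__":
    scratch = os.path.join(tempfile.mkdtemp(), "export.dat")
    print(handle_write(scratch, 0, b"safe write"))
    print(handle_write(scratch, 10, b" + fast write", stable=False))
```

Both calls return the same ACK, which is the whole problem: the client cannot tell which promise it was given.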

If the server ACKs the data before writing it to disk, there is a window
during which the server can crash.  The data is then lost.

MIPS systems have an unsafe export option that allows you to turn off
this constraint - big performance win, big safety lose.

There are other ways to address this problem without breaking the
semantics of NFS.  One such way is to buffer the writes in NVRAM.
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com