Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!convex!convex.convex.com!thurlow
From: thurlow@convex.com (Robert Thurlow)
Newsgroups: comp.protocols.nfs
Subject: Re: NFS writes and fsync().
Message-ID:
Date: 12 Oct 90 16:15:35 GMT
References: <1990Oct9.152612@objy.objy.com>
Sender: usenet@convex.com
Lines: 52

In <1990Oct9.152612@objy.objy.com> peter@objy.objy.com (Peter Moore) writes:

>WHY ON EARTH DOES NFS REQUIRE THE FSYNC ON WRITES?  Without that
>requirement, we could get the effect of this cache board by just not
>calling fsync().

No, you couldn't.  The cache board for PCs that I know about is a nice
unit that essentially promises you the data won't go away, and keeps it
in battery-backed memory to ensure that.  That's important, since once
the write request is acknowledged, the client will not try the write
again, and may discard its copy of the data.  If the server
acknowledges a write without syncing it and then goes down, that data
is simply lost.  Usually, too, what waits for the acknowledgement is a
block I/O daemon (biod) that will handle your async writes for you;
your process has to wait for all I/O only when it does an fsync() or a
close(), though aggregate throughput is reduced.

I think most people would agree that the default behaviour should be
to make writes reliable, since that provides the semantics of a local
filesystem.  You are more free to buy extra throughput by upgrading
the server disk or CPU than you are to buy more reliability.

That said, I'll add that we do provide an export option to allow you
to tell the server to acknowledge the write request immediately upon
receipt, and spool the request to its local I/O subsystem.  It can
help performance a good bit if you don't mind the risks.  It's great
for filesystems that all clients mount with -soft; their processes
will be gone after a server reboot, anyway.

>Now whenever I see something ugly in NFS, it usually comes from the
>stateless requirement.
>But the only state dependent reason I can see is:
>	Process P on machine A writes to machine B
>	machine B crashes before the write is synced to disk

Stop right there.  Your 'disk' has just lost data, period.  Do you
expect your local disk to ever do that?  The effects could be
devastating, depending on what exactly cared about the data.  Think of
the havoc you could wreak on a database server.

>But in real life, I have seen situations vaguely like this, and the
>writing process gets a `stale NFS handle' error.  So it seems that at
>least the NFS implementations I have run into have that much state.

ESTALE only happens when the server can't find anything matching the
file handle on its disks, and usually happens when some other process
did a creat() or an unlink(), or the server filesystems got mounted in
a different order.  I don't see the connection here.

Hope that helped,
Rob T
--
Rob Thurlow, thurlow@convex.com or thurlow%convex.com@uxc.cso.uiuc.edu
----------------------------------------------------------------------
"I was so much older then, I'm younger than that now."
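
On the client side, the point about a process waiting only at fsync()
or close() can be sketched roughly as follows.  This is an
illustration in Python, assuming a POSIX-like system; write_durably
and the temporary file it writes are hypothetical names, not part of
NFS or of any implementation discussed in this thread.  The idea is
that write() may return before the data is safe, so a caller who
cares should treat a failure from fsync() or close() as a failed
write.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data to path; return only after fsync() and close() succeed.

    Hypothetical helper for illustration.  Any OSError raised by
    write(), fsync() or close() propagates to the caller, since on a
    remote filesystem a deferred write error may only surface there.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        written = 0
        # write() may accept fewer bytes than requested, so loop.
        while written < len(view):
            written += os.write(fd, view[written:])
        # Block until the data has been pushed to stable storage.
        os.fsync(fd)
    finally:
        # close() can also report an error; let it raise.
        os.close(fd)
    return written


if __name__ == "__main__":
    # Demo against a local temporary directory (an assumption; on a
    # real NFS mount the calls are the same, only the latency differs).
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        count = write_durably(target, b"important record\n")
        print("bytes written and synced:", count)
```

A process that skips the fsync() still gets its data flushed at
close(), so checking the result of close() matters just as much.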