Path: utzoo!attcan!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!exodus!terra.Eng.Sun.COM!brent
From: brent@terra.Eng.Sun.COM (Brent Callaghan)
Newsgroups: comp.protocols.nfs
Subject: Re: Buffering in biod and nfsd.
Message-ID: <4646@exodus.Eng.Sun.COM>
Date: 17 Dec 90 22:09:15 GMT
References: <1990Dec15.071319.16674@objy.com>
Sender: news@exodus.Eng.Sun.COM
Lines: 108

In article <1990Dec15.071319.16674@objy.com>, peter@prefect.Berkeley.EDU (Peter Moore) writes:
>
> I have some questions on cacheing and synchronization that I hope some
> of you NFS implementors can answer.
>
> biod:
>     I have seen biod described as read-ahead and write-behind.
>     Which implies it both caches writes (in the sense that it
>     returns before the write is actually done) and it actually
>     reads more blocks than requested, in anticipation of the
>     additional blocks being used in a future calls.  So:

I can speak for the SunOS implementation:

>     a) Does it return before the actual NFS-write is complete?

You don't need biod's for this to be true.  Writes go into
mapped file pages and are cached there.  The cached data gets
flushed only if a write crosses a page boundary, if the file is
closed, or if the page daemon flushes it.  At flush time the page
is scheduled for a biod.  If no biod's are available (perhaps
they're all busy) then the flush is done in the process context
(it becomes synchronous on the client).

In general, yes: thanks to client caching, writes will return
before the data is written to the server.

>     b) Again, if a) is true, is there any way for the user to find
>        that the write failed?

Yes, failed writes will be recorded with the file's rnode.  The error
can be tested for on subsequent writes or on the file close.

>     c) If so, is there any way a user process can assure that a
>        particular block or all of its writes in have been
>        written yet?  In particular does fsync work or is it (as I
>        have heard) a no-op?

The biod's are invoked only for asynchronous IO.  An fsync implies
a mandatory synchronous IO, and indeed that's how it's implemented
in SunOS.  An fsync will not return until the file changes are flushed
to the server's disk.

>     d) Does biod actually read-ahead?

Yes.

>     e) If so, how does it decide when to flush the cached data and
>        actually re-read the data?

When cached attributes time out they are refreshed on demand.  If the
new attributes indicate that the file has changed then the cached file
pages are invalidated.

>     f) Is there any way a user process can affect that cacheing?

Not really - except to use such heavy-handed techniques as
mounting with the "noac" option, or using fsync to force updates to
the server.

> nfsd:
>     a) Does the nfsd the write back directly do disk, or maintain
>        a personal cache?  (My understanding is that modulo
>        WRITECACHE, it definitely does not, in fact it even flushes
>        the OS cache).

Yup, the nfsd is required to do synchronous writes to stable storage
(disk).  The nfsd doesn't maintain a write cache.

>     b) If (heaven forbid and presto-serve not installed) it does cache
>        writes, can this be flushed under user control?

NA.

>     c) Does it do any read-ahead/read-cacheing (I would certainly
>        hope it wouldn't)

Not explicitly.  Only if the underlying VFS does it.

>     d) If (again, heaven forbid) it does do read-cacheing, can that
>        be flushed under user control?

No.  BTW: read caching on the server is fine so long as it's
write-through to stable storage.

> My guess is that nfsd doesn't do any cacheing (except for the implicit
> cacheing of the OS buffer pool),

Right.

>biod does write-behind and read ahead,
> but there is no way to control any of it at the user process level.
> But I hope this is not true, since it make NFS mounted file systems
> pure poison for any one doing distributed database work.

You've got it in a nutshell.
--

Made in New Zealand -->  Brent Callaghan  @ Sun Microsystems
                         Email: brent@Eng.Sun.COM
                         phone: (415) 336 1051
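
Appended illustration of answers b) and c) above.  This is a minimal
sketch in C, assuming a POSIX-style client; write_durably() is a
hypothetical helper name, not part of SunOS or of any NFS interface.
It only shows where a user process can look for a deferred write
error: on each write, on the fsync, and on the close.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and wait for them to be
 * flushed.  Returns 0 on success, -1 on failure with errno set.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int saved;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    while (len > 0) {
        /* On an NFS client this may only fill cached pages.  An error
         * from an earlier asynchronous flush can be reported here. */
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same chunk */
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Force a synchronous flush; do not treat the data as safe
     * until this has returned successfully. */
    if (fsync(fd) < 0)
        goto fail;

    /* close() is the last point at which a deferred error can be seen,
     * so its return value is passed back to the caller. */
    return close(fd);

fail:
    saved = errno;              /* keep the original error across close() */
    close(fd);
    errno = saved;
    return -1;
}
```

A caller would treat any non-zero return as "the data may not be on
the server", for example:
if (write_durably("/mnt/nfs/log", rec, reclen) != 0) perror("write_durably");
where the path and the rec/reclen variables are placeholders.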