Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!acd4!mjb
From: mjb@acd4.UUCP ( Mike Bryan )
Newsgroups: comp.protocols.nfs
Subject: Re: NFS client has out-of-date files
Message-ID: <1989Oct6.203732.12847@acd4.UUCP>
Date: 6 Oct 89 20:37:32 GMT
References: <17787@bellcore.bellcore.com> <1967@convex.UUCP>
Reply-To: mjb@acd4.UUCP ( Mike Bryan )
Distribution: na
Organization: Applied Computing Devices, Inc., Terre Haute, IN
Lines: 96

In article <1967@convex.UUCP> thurlow@convex.com (Robert Thurlow) writes:
>tr@bellcore.com (tom reingold) writes:
>>A user was working on two Suns simultaneously. He was editing one file
>>on one Sun, and reading the file using LaTeX on the other. The file
>>was NFS mounted on both Suns. It physically resided on the NFS server,
>>a Pyramid. One of the Suns had an out-of-date copy! It was a minute
>>old.
>
>We've had this; what seems to be a common cause for it is that the
>time is not synchronized between the updating client and the server,
>so the file attributes don't get through the server and show up on
>disk until something changes.

Well, here goes an attempt to describe what's happening. We had this
problem with our systems, and because of it have had to abandon using
NFS for our customer systems (at least for now).

An NFS client maintains a cache of accessed files. This cache
includes file attributes (such as modification time and
ownership/protection). If it has a file's data locally, and the
attributes were "recently" read, it will not try to access the server.
It's the definition of "recent" that causes the problems.

The client will periodically re-read the file attributes from the
server. If it determines that the file has been modified, it will
decide the local data is invalid, and request the file data from the
server. The problem you are seeing is that the client can take too
long to realize the file has changed.

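To make the caching behaviour above concrete, here is a minimal toy model in Python. It is only a sketch of the idea as described in this post, not real NFS client code: the names `ServerFile`, `CachingClient`, and `attr_timeout` are invented for illustration, and time is passed in as whole seconds rather than read from a clock.

```python
from dataclasses import dataclass


@dataclass
class ServerFile:
    """Hypothetical stand-in for a file living on the NFS server."""
    data: str
    mtime: int  # modification time in seconds, by the server's clock


class CachingClient:
    """Toy model of a client-side file cache (illustrative only)."""

    def __init__(self, server_file, attr_timeout=3):
        self.server_file = server_file
        self.attr_timeout = attr_timeout  # how long attributes count as "recent"
        self.cached_data = None
        self.cached_mtime = None
        self.attrs_read_at = None  # client time of the last attribute read

    def read(self, now):
        attrs_recent = (
            self.attrs_read_at is not None
            and now - self.attrs_read_at <= self.attr_timeout
        )
        if self.cached_data is not None and attrs_recent:
            # Data is local and attributes are "recent": no server access.
            return self.cached_data

        # Attributes are too old (or were never read): re-read them.
        server_mtime = self.server_file.mtime
        self.attrs_read_at = now
        if self.cached_data is None or server_mtime != self.cached_mtime:
            # The file was modified: local data is invalid, fetch it again.
            self.cached_data = self.server_file.data
            self.cached_mtime = server_mtime
        return self.cached_data


shared = ServerFile(data="old", mtime=100)
client = CachingClient(shared, attr_timeout=3)
first = client.read(now=0)              # first access fetches "old"
shared.data, shared.mtime = "new", 101  # someone else updates the file
stale = client.read(now=2)              # attributes still "recent": cached copy
fresh = client.read(now=4)              # attributes expired: change is noticed
```

In this model the second read returns the old contents even though the server already holds the new ones, which is the stale-file effect in miniature. The window is bounded by `attr_timeout`, so everything hinges on how that interval is chosen.
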
(The following might be a bit off technically; it's been almost a year
since I investigated all of this. If so, I apologize. However, I'm
certain any errors are minimal, and it should get the gist across.)

Normally, the file attributes are checked every 3 seconds. However,
if the system times are skewed, it can take longer. (I don't remember
exactly which times are being compared, but it has something to do
with the last time the attributes were read and the times within those
attributes.) In Ultrix 2.3, at least, these "re-check" times are
controlled by the following four kernel variables:

	Name			Value (in seconds)
	------------------	------------------
	nfsac_regtimeo_min	 3
	nfsac_regtimeo_max	60
	nfsac_dirtimeo_min	30
	nfsac_dirtimeo_max	60

The "*_min" values determine how often the client tries to look at the
file attributes. These values don't hold if the time is skewed,
however. The "*_max" values determine how often the attributes are
re-read NO MATTER WHAT. So even if the times are skewed, your data
should be no more than 60 seconds out of date (and this *is* what you
reported seeing). The above values are in two sets: "*dir*" applies
to directory files, and "*reg*" applies to regular files. I don't
know if these same names are used in other O/S's, but I'd bet Sun is
at least close, since Ultrix changed very little of Sun NFS for
2.2/2.3.

What does all this mean? Well, you can try changing these kernel
values. We did, and saw the data-skew problem lessen as expected.
However, you pay a performance penalty, since requests are more likely
to access the server rather than use the cache. Also, even at "0",
there is up to a one second delay, since the code apparently waits
until the time difference is strictly greater than the given value.
(Without source, I can't say for sure, however.) Admittedly, I did
not try a "-1", but that might cause problems, especially if they are
unsigned variables. (Hmm, infinite time/data skew. How lovely!)

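The interplay of the "*_min" and "*_max" values can be sketched as a simple clamp. This is a guess that fits the behaviour described above, not something taken from kernel source: the assumption that the candidate interval is the file's apparent age (client time of the attribute read minus the server-stamped modification time) is exactly that, an assumption, and the function names are made up. The 32-bit width in `as_unsigned32` is likewise assumed.

```python
# Values quoted above for Ultrix 2.3, in seconds.
REG_MIN, REG_MAX = 3, 60
DIR_MIN, DIR_MAX = 30, 60


def recheck_interval(apparent_age, tmin, tmax):
    """Clamp a candidate re-check interval into [tmin, tmax].

    ASSUMPTION: the candidate is the file's apparent age as seen by the
    client. Which times the kernel really compares is not confirmed here.
    """
    return max(tmin, min(apparent_age, tmax))


def attrs_expired(now, attrs_read_at, interval):
    # Strictly greater than, as the kernel apparently does.
    return now - attrs_read_at > interval


def as_unsigned32(value):
    # What an (assumed) 32-bit unsigned C variable holds after `var = value`.
    return value & 0xFFFFFFFF


# File modified 1 second ago, clocks in sync: the minimum applies.
in_sync = recheck_interval(1, REG_MIN, REG_MAX)

# Same file, but the client clock runs 45 seconds ahead of the server,
# so the file looks 46 seconds old and the interval stretches.
skewed = recheck_interval(1 + 45, REG_MIN, REG_MAX)

# However large the skew, the maximum caps the interval.
capped = recheck_interval(1 + 500, REG_MIN, REG_MAX)

# With the interval set to 0 and whole-second timestamps, attributes read
# at t=10 are not yet expired at t=10, only at t=11.
same_second = attrs_expired(10, 10, 0)
next_second = attrs_expired(11, 10, 0)

# Setting -1 into an unsigned variable gives an enormous interval.
minus_one = as_unsigned32(-1)
```

Under these assumptions a skewed client clock pushes the interval from the minimum towards the maximum, which would match a copy that is up to a minute old, and a "-1" setting becomes 4294967295 seconds, i.e. effectively never re-checking.
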
Also, you can supposedly remove all data skew by using the NFS lock
daemons and applying a lock to the file in question. Since we were
running 2.3 Ultrix at the time, and it did not have NFS locking, I
haven't verified this, nor do I know the details. Maybe I'll check it
out again since we are gearing up for Ultrix 3.0/3.1 support now.

Note: All of the above deals with the case of keeping data synched
between a client and its server. If you have multiple clients, and
one client is reading what another client is writing, you have an
additional delay added by the time for the data to propagate from the
writing client to the server. This is controlled by the sync/update
procedure, and can cause further delays of up to 30 seconds. (NFS
*might* be a write-through cache, but I don't think so.) We at least
had the writes occurring on the server, but we were unable to use NFS
for this particular application even then, as we had to have the same
synchronous read/write semantics as for local files. *Sigh*.

Anyway, hope this helps anyone who has noticed the same problem.
Normally, it should not cause serious problems, especially if you keep
the system times synchronized. If you aren't expecting it though, it
can be quite frustrating.
-- 
Mike Bryan, Applied Computing Devices, 100 N Campus Dr, Terre Haute IN 47802
Phone: 812/232-6051  FAX: 812/231-5280  Home: 812/232-0815
UUCP: uunet!acd4!mjb  ARPA: mjb%acd4@uunet.uu.net
"Did you make mankind after we made you?" --- XTC, "Dear God"