Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!agate!eos!ames!oliveb!sun!commuter!beepy
From: beepy%commuter@Sun.COM (Brian Pawlowski)
Newsgroups: comp.protocols.nfs
Subject: Re: hard vs. soft mounts on Suns and Pyramids
Summary: mount option explanation
Message-ID: <102860@sun.Eng.Sun.COM>
Date: 4 May 89 05:27:59 GMT
References: <15766@bellcore.bellcore.com> <840@mtxinu.UUCP> <290@ai.cs.utexas.edu>
Sender: news@sun.Eng.Sun.COM
Lines: 157

Carl Smith and Mark Stein have given me some notes on what intr, hard
and soft mean to an NFS client. This doesn't answer all your questions,
but does put them in context. This information is pretty accurate for
UNIX client implementations of NFS derived from the NFS/ONC reference
port. The analogies are entirely my own.

An NFS client will time out a request if the server does not respond
in some (user-specifiable) period. This is coupled with a retrans count
and a backoff mechanism on the timeouts to deal with slow servers.
A server must be able to deal with multiple, duplicate requests arising
from retries as a result of its tardy responses.

Mark Stein gave an enlightening talk on the timeout strategy for a
UNIX client during the MVS/NFS server development:

The mount operation allows specifying a timeout value - timeo, and
a retry value - retrans - the number of retransmissions of the NFS
operation, and whether the mount is hard or soft. The soft option
returns an error if the server does not respond (as described below),
whereas hard says continue to retry the request until the server
responds. The intr option may be added to modify the behaviour of hard
mounts, and allows keyboard interrupts to stop the retransmissions.
These are described in the following pictures.

A normal NFS request (which is successful on the first shot) is
processed as follows:

    Client          The Ether       Server
    ------          ---------       ------

    NFS --->        --------->                      |
    request                                         |
                                    <------         |
                                    server          |  Increasing
                                    responds        |  Time
                    <---------                      |
    Response                                        |
    Client                                          |
    Continues                                       |
                                                    V

The following picture shows timeouts (the timeo value is entered in
tenths of seconds) up to a retry value (4) against an unresponsive
server:

    Client              The Ether       Server
    ------              ---------       ------

    NFS --->    -       --------->                      |
                |       request                         |
    timeo = 7   |                                       |
                |                                       |
    NFS --->    -       --------->                      |
                |       request                         |
    timeo = 14  |                                       |
                |                                       |  Increasing
    NFS --->    -       --------->                      |  Time
                |       request                         |
    timeo = 28  |                                       |
                |                                       |
    NFS --->    -       --------->                      |
                |       request                         |
    timeo = 56  |                                       |
    retrans = 4 |                                       |
                -       --------->                      |
                        request                         |
    <-------                                            |
    Timeout returned                                    |
    to caller IF SOFT! - A Major Timeout                |
    else if HARD or INTR, double                        |
    timeo and reenter loop.                             |
                                                        V

A TIMEOUT is registered on the client by NFS only after the timeo
period has elapsed for each of the retrans retransmissions specified.
The initial timeo value itself may be dependent on the type of
operation (write vs. getattr vs. read) in a given NFS client
implementation. On each retransmission, the timeo value is doubled.

If a server is mounted soft, the timeout is returned to the calling
procedure or program.

If a server is mounted hard, NFS will back off (double the current
timeo value on each major timeout, up to some maximum) and attempt
again with the new, longer timeo value for the specified retrans count
(with the new current timeo value doubled at each retransmission).
The initial timeo on entry to each retransmission cycle has a maximum
value of 30 seconds. The timeo within a retransmission sequence has a
maximum value of 60 seconds. timeo is specified in tenths of seconds.

If the server is mounted intr, this is the same as hard, except that
on major timeouts (current, aged timeo value times retrans count with
backoff) a software interrupt may force an error return of timeout to
the calling procedure or program.

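To make the arithmetic concrete, here is a rough sketch in Python of the
schedule as described above. It is not the kernel code. The names
minor_timeouts and cycles are made up for illustration, the caps are
simply the 30 and 60 second figures quoted above, and it assumes that
retrans counts the timed waits in one cycle, which is only one reading
of the picture.

```python
# Illustrative model only; all names here are hypothetical.
MAX_INITIAL_TIMEO = 300   # 30 seconds, in tenths of a second
MAX_TIMEO = 600           # 60 seconds, in tenths of a second


def minor_timeouts(initial_timeo, retrans):
    """Return the waits (tenths of a second) within one retransmission cycle.

    The wait doubles after every retransmission, capped at MAX_TIMEO.
    """
    waits = []
    timeo = min(initial_timeo, MAX_TIMEO)
    for _ in range(retrans):
        waits.append(timeo)
        timeo = min(timeo * 2, MAX_TIMEO)
    return waits


def cycles(timeo, retrans, major_timeouts):
    """Return the waits for successive cycles of a hard mount.

    After each major timeout the starting timeo is doubled, capped at
    MAX_INITIAL_TIMEO, and the retransmission loop is entered again.
    """
    result = []
    initial = min(timeo, MAX_INITIAL_TIMEO)
    for _ in range(major_timeouts):
        result.append(minor_timeouts(initial, retrans))
        initial = min(initial * 2, MAX_INITIAL_TIMEO)
    return result


if __name__ == "__main__":
    soft = minor_timeouts(7, 4)
    print("soft mount waits:", soft, "-> error after", sum(soft) / 10, "seconds")
    for n, waits in enumerate(cycles(7, 4, 3), start=1):
        print("hard mount cycle", n, waits)
```

Under this model, with the values in the picture (timeo = 7,
retrans = 4), a soft mount gives up after 0.7 + 1.4 + 2.8 + 5.6 = 10.5
seconds, while a hard mount starts its second cycle at timeo = 14.
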
In older implementations of NFS, an interrupt can only slip in on a
major timeout, so a request that has an aged timeo value with even a
small retrans count can take a mighty long time indeed to respond when
a server is mounted intr. Later implementations allow the interrupt to
stop retransmissions much sooner. Sometimes a mount may seem
uninterruptible when in actuality the client, in an older NFS
implementation, may have backed off so far that the window for the
interrupt to take effect is a long way off.

Now on soft mounts garbling the data: this is entirely application
dependent. If applications check their errors on write()'s (mine do :-)
then they will see the error and will most likely abnormally end.
Most applications probably do not, so you get intermittent failures,
some successes, and resulting garbled data.

Now what I forget to do is to check the return values on close() -
where you may see an asynchronous error from a previous write() call.
This may be where you see writes returning OK - but if you check your
close(), it will probably fail.

Remember - the writes to the server are REALLY asynchronous to your
application given the buffering inherent in UNIX (which exists between
your application and NFS). The fact that the write returns OK to the
application, and may later fail (soft mounted), is consistent with
normal UNIX behaviour for say a failing disk - where the error is
detected at some time after the write() returns OK to the application,
when the system actually attempts to write the buffer to disk.

That is why the safest bet, for critical data (such as the NFS files
which represent your ROOT PARTITION for diskless clients), is to hard
mount the file system. If you like living on the ragged edge, specify
the intr option on writable partitions - then you have the control as
to whether or not you'll trash your file writes in progress - with the
same behaviour as if you've interrupted write()'s to a local hard disk.

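The advice above about checking write() and close() might look
something like the following in Python, where failed system calls
surface as OSError. This is a minimal sketch, not anything taken from
an NFS client. The helper name careful_write is made up, and the
os.fsync() step is an addition of this example, not something the
discussion above mentions. It is there on the assumption that you want
a deferred write error reported before the close.

```python
import os


def careful_write(path, data):
    """Write data (bytes) to path, checking every step for errors.

    Returns None on success, or a (step, OSError) tuple naming the first
    step that failed: "open", "write", "fsync", or "close".
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError as err:
        return ("open", err)

    failure = None
    step = "write"
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may accept fewer bytes than requested; keep going.
            written = os.write(fd, view)
            view = view[written:]
        # Assumption of this sketch: force buffered data out now so that a
        # deferred error shows up here instead of after we stop looking.
        step = "fsync"
        os.fsync(fd)
    except OSError as err:
        failure = (step, err)
    finally:
        try:
            # close() can itself report an error from an earlier write.
            os.close(fd)
        except OSError as err:
            if failure is None:
                failure = ("close", err)
    return failure
```

A caller would treat any return value other than None as "the data may
not be on the server", whichever step reported it.
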
An analogy is that mounting hard with the intr option makes your
server most closely resemble a local hard disk for your applications.
If you mount your active writable filesystems soft, you might consider
taking up skyjumping for a hobby where you use randomly defective
parachutes for that certain extra thrill.

Some work is being done for future NFS releases to implement a dynamic
retransmission algorithm, which would affect the above discussion.
What I have described is pretty valid for UNIX clients out there now.

Brian Pawlowski
Manager ONC Porting
Sun Microsystems, Portable Software Products