Xref: utzoo comp.os.msdos.programmer:1297 comp.protocols.nfs:1318
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!execu!sequoia!chinacat!uudell!bigtex!texsun!newstop!jaytee!hinode!geoff
From: geoff@hinode.East.Sun.COM (Geoff Arnold @ Sun BOS - R.H. coast near the top)
Newsgroups: comp.os.msdos.programmer,comp.protocols.nfs
Subject: Re: Sharing Violation - MSDOS3.3 and PC-NFS
Message-ID: <2802@jaytee.East.Sun.COM>
Date: 3 Oct 90 12:47:53 GMT
References: <1039@massey.ac.nz>
Sender: news@jaytee.East.Sun.COM
Reply-To: geoff@east.sun.com (Geoff Arnold @ Sun BOS - R.H. coast near the top)
Followup-To: comp.os.msdos.programmer
Organization: Sun Microsystems PC-NFS Engineering
Lines: 55

Quoth GEustace@massey.ac.nz (Glen Eustace) (in <1039@massey.ac.nz>):
#[A series of questions about sharing failures]

When you enable locking and sharing services by mounting a file system
with /MS (or the equivalent in NFSCONF), PC-NFS calls the portmapper on
the server to obtain the UDP port for the Network Lock Manager (NLM),
squirrels this away in the mount table and sets an internal "MS" flag
on the drive.

The effect of MS is twofold. First (and most obviously) DOS Lock and
Unlock (Int21H 5C00H and 5C01H) calls are intercepted and translated
into the equivalent RPCs to the NLM. Secondly, all file opens and
closes become subject to file sharing verification, as described in the
DOS Technical Reference for the Open (Int21H 3DH) call. When a file is
opened, an NLM_SHARE call is made to the NLM to check that it is OK to
open the file in the requested mode for the requested use. And
obviously, when the file is closed an NLM_UNSHARE call is made to tell
the NLM that the file is no longer in use. Thus even if you don't think
"any locking is going on", every file open and close is causing at
least one RPC call to the NLM on the server.

So far, so good. What happens when things break down?
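For readers who want the normal-case flow above in concrete form, here is a rough sketch in Python. It is a toy model, not PC-NFS code: FakePortmapper, FakeNLM, Drive and Client are hypothetical names, nothing goes over the wire, and the share check is reduced to "requested access must not hit an existing deny, and requested deny must not hit an existing access". The only detail taken from the real protocol is 100021, the RPC program number under which the lock manager registers with the portmapper.

```python
from dataclasses import dataclass
from typing import Optional

NLM_PROGRAM = 100021  # RPC program number registered by the lock manager


class FakePortmapper:
    """Hypothetical stand-in for the server's portmapper."""

    def __init__(self, ports):
        self.ports = dict(ports)  # RPC program number -> UDP port

    def getport(self, program):
        return self.ports[program]


class FakeNLM:
    """Hypothetical stand-in for the server's NLM; logs every call."""

    def __init__(self):
        self.calls = []
        self.shares = {}  # path -> list of (access, deny) pairs held

    def share(self, path, access, deny):
        self.calls.append("NLM_SHARE")
        held = self.shares.setdefault(path, [])
        for held_access, held_deny in held:
            # Refuse if we want something another opener denies, or if
            # we want to deny something another opener is already using.
            if access & held_deny or deny & held_access:
                return False
        held.append((access, deny))
        return True

    def unshare(self, path, access, deny):
        self.calls.append("NLM_UNSHARE")
        self.shares[path].remove((access, deny))


@dataclass
class Drive:
    letter: str
    ms: bool = False                # the internal "MS" flag
    nlm_port: Optional[int] = None  # cached in the mount table


class Client:
    """Toy model of the client side; all names are illustrative."""

    def __init__(self, portmapper, nlm):
        self.portmapper = portmapper
        self.nlm = nlm
        self.mount_table = {}

    def mount(self, letter, ms=False):
        drive = Drive(letter)
        if ms:
            # /MS: look up the NLM port once and remember it.
            drive.nlm_port = self.portmapper.getport(NLM_PROGRAM)
            drive.ms = True
        self.mount_table[letter] = drive
        return drive

    def open(self, letter, path, access, deny):
        drive = self.mount_table[letter]
        # On an MS drive every open costs one NLM_SHARE call.
        if drive.ms and not self.nlm.share(path, access, deny):
            raise PermissionError("sharing violation")
        return (letter, path, access, deny)

    def close(self, handle):
        letter, path, access, deny = handle
        # ...and every close costs one NLM_UNSHARE call.
        if self.mount_table[letter].ms:
            self.nlm.unshare(path, access, deny)
```

In this model, opening a file on a drive mounted with ms=True using access frozenset("rw") and deny frozenset("w") succeeds, and a second open of the same file asking for write access raises PermissionError until the first handle is closed. A drive mounted without ms never touches the NLM at all.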

Unlike the NFS server, the NLM is a regular user level process, and on
a busy server it is possible for the process to get swapped out and not
respond for an extended period. Obviously this is more likely on a
smaller or slower server.

There is another potential problem. Suppose the server is rebooted, or
even that the NLM is killed and restarted. Any locks/shares that it was
holding on behalf of its clients are lost. In Unix there is a mechanism
to re-establish locks, based on the status monitor. (For more details,
see the Usenix paper by Jo-Mei Chang from ~1986.) Unfortunately this is
too heavyweight a scheme to shoehorn into a PC. And even if there were
no open files, and thus no outstanding locks or shares to be recovered,
the UDP port in the mount table is going to be incorrect after the NLM
is restarted for any reason.

In PC-NFS 2.0, 3.0 and 3.0.1 any server failure - real or perceived -
is treated as fatal to the current application if it touches the
affected drive. This is prudent; it is also the way in which most other
PC networks handle things. However, we also mark the drive as FAILED
(which is where the "?" in the "NET USE" display comes from) and
disallow any further access to the drive until it has been remounted.
Any attempted access will provoke the critical error message you
describe.

This is actually overkill, and in the next release of PC-NFS we'll fix
this and call the portmapper to reacquire the NLM port. This means that
if you have a drive mounted /MS and someone reboots the server, the
next time you touch a file on the drive PC-NFS will try the old port,
time out (which takes a few seconds) and then get the new port and
proceed.

Geoff
-- 
Geoff Arnold, PC-NFS architect, Sun Microsystems. (geoff@East.Sun.COM)
--
*** "Now is no time to speculate or hypothecate, but rather a time ***
*** for action, or at least not a time to rule it out, though not  ***
*** necessarily a time to rule it in, either." - George Bush       ***