Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 Apollo 11/21/85; site apollo.uucp
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!decvax!wanginst!apollo!mishkin
From: mishkin@apollo.uucp (Nathaniel Mishkin)
Newsgroups: net.unix-wizards
Subject: Re: File locking on networks
Message-ID: <2b598755.3166@apollo.uucp>
Date: Wed, 15-Jan-86 10:29:22 EST
Article-I.D.: apollo.2b598755.3166
Posted: Wed Jan 15 10:29:22 1986
Date-Received: Fri, 17-Jan-86 03:14:47 EST
References: <910@brl-tgr.ARPA> <2adcce15.1de6@apollo.uucp> <1011@brl-tgr.ARPA> <1106@brl-tgr.ARPA> <419@hoptoad.uucp> <1507@brl-tgr.ARPA>
Reply-To: mishkin@apollo.UUCP (Nathaniel Mishkin)
Organization: Apollo Computer Inc., Chelmsford MA
Lines: 46
Summary:

Here's how we (Apollo) deal with locking.  It's not perfect, but in
practice (e.g. on our internetwork of 1000+ workstations on 7 networks)
it works quite well:

There are two nodes associated with every lock: the home node (i.e. the
node the file lives on), and the locking node (i.e. the node that the
process requesting the lock is running on).  The existence of a lock is
registered on both the home node and the locking node.  However, the
information on the home node is what really matters to the world, since
every lock request for files on that node comes to it, not to any of
the locking nodes.  (Obviously, sometimes the home node and the locking
node can be identical, but this case is trivial, so I won't consider
it.)

Locks are held in volatile storage (i.e. virtual memory, not disk) and
hence evaporate when a node goes down.  If a node is explicitly shut
down, many locks will be unlocked by virtue of the processes holding
them being killed.  Of any remaining locks, those held BY the node
shutting down are force-unlocked.  Then the node broadcasts an "unlock
all" message to all other nodes.  Recipients of such a message
force-unlock all locks held BY the recipient ON files on the node that
sent the message.
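
To make the bookkeeping above concrete, here is a minimal in-memory
sketch of the two lock tables and the shutdown sequence.  It is an
illustration only, not Apollo's code: the names Node, register_lock,
shut_down and on_unlock_all are hypothetical, locks are tracked per
node rather than per process, and the broadcast is modeled as a plain
loop over a dictionary of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One workstation. Both tables live in memory only (hypothetical model)."""
    name: str
    # Home-side table: local file path -> name of the node holding the lock.
    home_locks: dict = field(default_factory=dict)
    # Locking-side table: (home node name, path) pairs this node holds.
    held_locks: set = field(default_factory=set)

    def on_unlock_all(self, sender):
        """Drop every lock this node holds on files that live on `sender`."""
        self.held_locks = {
            (home, path) for (home, path) in self.held_locks if home != sender
        }


def register_lock(home, locker, path):
    """Record a lock on both nodes; the home node's table is the one that decides."""
    if path in home.home_locks:
        return False  # already locked by someone
    home.home_locks[path] = locker.name
    locker.held_locks.add((home.name, path))
    return True


def shut_down(node, network):
    """Explicit shutdown of `node`; `network` maps node names to Node objects."""
    # Force-unlock the locks still held BY the node that is shutting down.
    for home_name, path in list(node.held_locks):
        network[home_name].home_locks.pop(path, None)
    node.held_locks.clear()
    # Broadcast "unlock all" (a plain loop stands in for the broadcast).
    for other in network.values():
        if other is not node:
            other.on_unlock_all(node.name)
    # The home-side table is volatile, so it vanishes with the node.
    node.home_locks.clear()
```

In this model a node that crashes instead of shutting down cleanly
never runs shut_down, so its entries in other nodes' home tables
linger until something else clears them.
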
When a node boots, it broadcasts an "unlock all" message too.

When a node N locks a remote file, it sends a message to the remote
(home) node asking if it is OK to lock.  If the home node says "no,
because process P on node M has the file locked", N sends a message to
M asking if he really has that file locked.  If M says he doesn't have
the file locked, N tells the home node to force-unlock the file, and
then N tries to lock the file again.  This strategy is helpful in case
a node has missed an "unlock all" message.  (Since broadcasts aren't
propagated across bridges between networks, this can happen.)  Note
that if node M is unreachable, this scheme doesn't help.

So what do we do if you run into a "bad" case -- an internet partition
or a crashed node that hasn't been rebooted?  Well, someone will try to
open a file (and try to get a lock, since all opens must be accompanied
by locks) but will get the error "object is in use".  We supply tools
for USERS to see who (what node and process) has the lock.  The user
can then decide whether it's safe to forcibly break the lock (there's
another tool to do that).

It's not a perfect scheme, but let's remember: considering that people
run on Unix systems all the time with NO locking (even in the local
case), it's clearly a step up.

--
Nat Mishkin
Apollo Computer
apollo!mishkin
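
The remote-lock exchange described above (N asks the home node, then
double-checks with M before breaking a stale lock) can be sketched
roughly as follows.  This is a toy model under stated assumptions, not
Apollo's implementation: the class and method names are hypothetical,
messages are modeled as direct method calls, locks are tracked per node
rather than per process, and an unreachable node is just a flag.

```python
class Node:
    """Toy node; `network` is a shared dict of name -> Node (hypothetical)."""

    def __init__(self, name, network):
        self.name = name
        self.network = network
        self.up = True          # False models a crashed or partitioned node
        self.home_locks = {}    # home side: path -> name of locking node
        self.held = set()       # locking side: (home node name, path)
        network[name] = self

    # ---- requests handled by the home node ----
    def request_lock(self, path, requester):
        """Grant the lock and return None, or return the current holder's name."""
        holder = self.home_locks.get(path)
        if holder is None:
            self.home_locks[path] = requester
            return None
        return holder

    def force_unlock(self, path):
        self.home_locks.pop(path, None)

    # ---- request handled by the supposed holder M ----
    def really_holds(self, home_name, path):
        return (home_name, path) in self.held

    # ---- what the requesting node N does ----
    def lock_remote(self, home_name, path):
        """Return True if locked, False for the "object is in use" case."""
        home = self.network[home_name]
        holder = home.request_lock(path, self.name)
        if holder is not None:
            peer = self.network.get(holder)
            if peer is None or not peer.up:
                return False    # M unreachable: the scheme cannot help
            if peer.really_holds(home_name, path):
                return False    # genuine lock
            # M denies holding it: the home node's entry is stale.
            home.force_unlock(path)
            if home.request_lock(path, self.name) is not None:
                return False    # someone else got in first
        self.held.add((home_name, path))
        return True
```

When lock_remote returns False because M cannot be reached, the model
simply reports the file as in use, which is the case left to the user
tools mentioned above.
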