Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!swrinde!cs.utexas.edu!sun-barr!apple!veritas!tron
From: tron@Veritas.COM (Ronald S. Karr)
Newsgroups: comp.mail.misc
Subject: Re: Will ELM ever use lockf()?
Message-ID: <1991Apr15.085531.14547@Veritas.COM>
Date: 15 Apr 91 08:55:31 GMT
References: <1991Apr09.160512.1300@chinet.chi.il.us> <1991Apr11.024849.29924@Veritas.COM> <1991Apr11.181909.13503@chinet.chi.il.us>
Organization: VERITAS Software
Lines: 79

In article <1991Apr11.181909.13503@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>In article <1991Apr11.024849.29924@Veritas.COM> tron@Veritas.COM (Ronald S. Karr) writes:
>>However, both smail3 and elm create the files using the O_EXCL flag,
>>indicating that the OS should ensure that only one of the two creates
>>should succeed.
>
>No, look at the scenario again. The real problem is that the process
>that decides a lockfile is stale has no way to tell that the file
>that it unlink()'s is the same one that it tested.

Oops. Sorry, although I carefully checked through smail3's and elm's
locking code, I wasn't as careful with the message. You are correct
that there is a three-process hand-shaking problem with the locking
protocol.

In general, using true inode-based locking done atomically by the
operating system, with automatic lock releases on close or process
kills, is a substantially better solution. Barring that, locking
schemes using the link() system call generally work better than
conventions using just file creation and removal.

>Personally, I think the "right" solution is to deliver new mail into
>individual files per message with a naming convention so incomplete
>temporary files could be ignored by the mail reader.

Every modern OS has file locking capabilities which are entirely
sufficient for mail. They are just unused for mail in UNIX OS's,
except in 4.3BSD and its derivatives.
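
As a rough illustration of the operating-system locking referred to
above, here is a minimal sketch in Python. The helper name
append_with_kernel_lock and the mailbox path passed to it are
hypothetical, invented for this example; it is not taken from elm or
smail3 and only shows the general shape of a lockf()-style append.

```python
import fcntl
import os


def append_with_kernel_lock(mailbox_path, message):
    """Append one message (bytes) to a mailbox under a kernel lock.

    Hypothetical helper for illustration only.
    """
    # The descriptor must be open for writing to take an exclusive lock.
    fd = os.open(mailbox_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # Blocks until no other process holds a conflicting lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        os.write(fd, message)
    finally:
        # Closing the descriptor releases the lock. The kernel also
        # releases it if the process dies, so there is no lockfile left
        # behind for another process to judge stale and remove.
        os.close(fd)
```

The point of the sketch is that no separate lockfile exists, so the
test-then-unlink() race described in the quoted text cannot arise. It
only helps if every program touching the mailbox uses the same call.
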
Also, I believe that NFS lacks a true atomic exclusive create, so NFS
(but not RFS) has a window of vulnerability that will break many
(though not all) message file naming conventions.

You are correct that a directory would solve the problem of partial
mail delivery before a crash. The UNIX file model, with no layering
of conventions (e.g., transactions), is only reliable when files are
modified through a copy to a new file followed by a rename. Depending
upon your file system and UNIX, even this may not be completely
reliable.

> There would be
>several other advantages as well:
> You could require the destination directory to already exist, which
> would allow you to detect an unmounted NFS/RFS mount point and defer
> delivery instead of writing in a directory that will become hidden
> when the net comes back up.

There is no difference between directories and single files here,
since a mailer can just as easily check for existence of a file as
existence of a directory. It is just a matter of coming up with the
agreed upon conventions to accomplish this task.

> You could deliver to multiple recipients on the same machine with
> a simple link. This would make mailing lists about as cheap as
> a newsgroup and easier to maintain since the disk space would
> automatically be released when the last recipient's copy is deleted.

As with news, the relative efficiency of this solution is a function
of the average message size, the average number of recipients and the
overhead required for a file. When I used mh some time ago, and
stored all of my mail in single files, I found the overhead to be far
too great to consider the solution viable. You have to at least
migrate messages into groups stored in single files; otherwise,
searches become too slow, and the storage efficiency gets too low.
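
The "copy to a new file followed by a rename" update mentioned earlier
in this reply can be sketched roughly as follows. This is a minimal
sketch in Python, not code from any mailer; the function name
rewrite_atomically and the ".tmp-" prefix are assumptions made for the
example.

```python
import os
import tempfile


def rewrite_atomically(path, new_contents):
    """Replace the file at `path` with `new_contents` (bytes).

    Hypothetical helper: writes a complete new copy, then renames it
    over the original.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that it is on
    # the same file system; rename() cannot cross file systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask for the data to reach disk first
        # After the rename, a reader opening `path` gets either the old
        # file or the complete new one, never a partial write.
        os.rename(tmp_path, path)
    except BaseException:
        # Something failed before the rename completed: drop the partial copy.
        os.unlink(tmp_path)
        raise
```

As noted above, how reliable this is after a crash still depends on
the file system and the UNIX variant underneath.
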
>For the scenario I mentioned, you need at least three processes, so
>using a delivery mode where the foreground process just queues the
>file and a single daemon process handles all deliveries should make
>it safe where O_EXCL works. But, since the removal of lockfile is
>never safe I think it is worth testing its age anyway. Unless something
>has gone drastically wrong you should never need to remove another
>process' lockfile.

You are correct. The smail3 locking strategy is probably broken in
this respect, and should introduce a wait based on file modification
time.

PS: I suppose I should stop reading news about now and start working
on my taxes.
--
tron |-<=>-| ARPAnet: veritas!tron@apple.com tron@veritas.com
             UUCPnet: {amdahl,apple,pyramid}!veritas!tron