Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!cs.uoregon.edu!ogicse!intelhf!ichips!iwarp.intel.com!gargoyle!chinet!les
From: les@chinet.chi.il.us (Leslie Mikesell)
Newsgroups: comp.mail.misc
Subject: Re: Will ELM ever use lockf()?
Message-ID: <1991Apr09.160512.1300@chinet.chi.il.us>
Date: 9 Apr 91 16:05:12 GMT
References: <27FA24B0.5244@tct.com> <2800BA22.1CAE@tct.com>
Organization: Chinet - Chicago Public Access UNIX
Lines: 40

In article <2800BA22.1CAE@tct.com> chip@tct.com (Chip Salzenberg) writes:
>His patches for Smail, however, are *far* more than is required.
>Smail 3 already has a knob that controls mailbox locking specifically.
>In conf/EDITME, and in conf/os/, do *not* set
>FLOCK_MAILBOX.  That's it.  (See the comments in conf/os/sys5* for
>further description of FLOCK_MAILBOX.)

You should be aware, though, that the smail3 mailbox locking code is not
particularly robust when using lockfiles.  The usual problem of removing
a different lockfile than the one you tested becomes fairly likely when
the file being tested actually belonged to an exiting smail.  Try setting
smail's delivery mode to background and arrange to have several messages
delivered while you hit the '$' in elm to re-sync, and you will have a
pretty good chance of losing messages.  I tried to fix this by stat'ing
the lockfile and refusing to delete it unless it is fairly old, but I'm
still not sure it is perfect.

Failure scenario:
  Process A owns lockfile - Processes B & C are contending for one.
  B reads A's PID from lockfile.
  A finishes, removes lockfile and exits.
  B sends signal 0 to A's PID, notes that process is gone.
  C notes that no lockfile is present and creates one.
  B removes lockfile (now belonging to C) and creates one.
At this point both B and C think they have exclusive access to the mailbox.
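
The scenario above can be sketched in C.  This is an illustration only,
not smail's actual code: the function names (try_create_lock,
read_lock_pid, break_stale_lock) and the min_age parameter are made up
for the example, and O_EXCL creation is assumed to be atomic on the
local filesystem.  The comments mark where B's test and B's unlink()
come apart, and where an age check of the kind mentioned above would fit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Create the lockfile atomically and record the owner's PID in it.
 * Returns 0 on success, -1 if the file already exists or on error.
 * (The owner is a parameter only so the sketch is easy to exercise;
 * a real caller would pass getpid().) */
static int try_create_lock(const char *path, pid_t owner)
{
    char buf[32];
    int fd, n;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    n = snprintf(buf, sizeof buf, "%ld\n", (long)owner);
    if (write(fd, buf, (size_t)n) != (ssize_t)n) {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}

/* Read the PID stored in the lockfile; -1 if it cannot be read. */
static pid_t read_lock_pid(const char *path)
{
    long v = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &v) != 1)
        v = -1;
    fclose(fp);
    return (pid_t)v;
}

/* Remove the lockfile only if its recorded owner is gone AND the file
 * is at least min_age seconds old.  Returns 0 if removed, -1 if not. */
static int break_stale_lock(const char *path, time_t min_age)
{
    struct stat st;
    pid_t owner = read_lock_pid(path);      /* "B reads A's PID" */

    if (owner <= 0)
        return -1;
    if (kill(owner, 0) == 0 || errno != ESRCH)
        return -1;                          /* owner exists (or unknown) */

    /* RACE: by now A may have removed its own file and C may have
     * created a fresh one under the same name.  An unconditional
     * unlink() here would delete C's lock. */
    if (stat(path, &st) != 0)
        return -1;                          /* already gone: just retry */
    if (time(NULL) - st.st_mtime < min_age)
        return -1;                          /* too young to be A's file */

    /* A smaller window still remains between stat() and unlink(). */
    return unlink(path);
}
```

A caller would loop: try_create_lock(path, getpid()), and on failure
call break_stale_lock(path, min_age) and retry after a pause.  The age
check narrows the window but does not close it, since the file can
still be replaced between the stat() and the unlink().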

Note that the trick of sleep()ing a few seconds after deleting a stale
file does not help at all in this scenario, since the exiting program
deleted its own file.

I recently saw HDB uucico's debug output say something like:
   mlock tty64 succeeded
   failed to lock device tty64
indicating that the lockfile code had in fact failed to keep two
processes from accessing the same device concurrently, but the kernel
lock succeeded in doing so.  Thus anyone who thinks the HDB lockfile
locking code is foolproof is mistaken.

Les Mikesell
  les@chinet.chi.il.us