Path: utzoo!attcan!uunet!husc6!bloom-beacon!tut.cis.ohio-state.edu!ukma!rutgers!bellcore!tness7!ninja!sneaky!gordon
From: gordon@sneaky.TANDY.COM (Gordon Burditt)
Newsgroups: news.software.b
Subject: Re: concurrent execution of rnews and inews
Summary: Mandatory file locking + News 2.11.14 = deadlock
Keywords: rnews hang xenix_2.1.3 News_B2.11.14
Message-ID: <2426@sneaky.TANDY.COM>
Date: 13 Sep 88 21:05:39 GMT
References: <284@corum.UUCP>
Organization: Gordon Burditt
Lines: 62

News 2.11.14 has a deadlock problem on systems like Xenix where the file
locking is non-advisory (mandatory).  Xenix uses the locking() call in
place of lockf() (via a #define in a header file), and has LOCKF defined.

The situation:

"rnews -U" locks the lib/seq file while it is running, including the
time while it is waiting for children to finish.

"rnews -U" forks a "rnews -S -p " to run, then waits for it to finish.
That rnews forks a child rnews to process individual articles, and a
"compress -d" if the batch happened to be compressed.

The child rnews may need to access the lib/seq file to post an article
locally.  In particular, this happens if the incoming article is an
"ihave", and some of the articles in its list aren't present on the
receiving system yet, so it needs to generate a "sendme" article locally.

"rnews -U" is waiting for parent "rnews -S" while holding lib/seq locked.
parent "rnews -S" is waiting for child "rnews -S".
child "rnews -S" is trying to access lib/seq, but it is blocked until
"rnews -U" lets go of its lock.
DEADLOCK!

Other processes will stack up behind this deadlock, making a mess to
clean up, or eventually running the system out of swap space, process
slots, or open file table entries.

Many systems may avoid ever seeing this by running expire at a time when
articles do not come in, thus never giving rnews -U any work to do.

The fix:

This is a kludge.  This change causes "rnews -U" to lock only the
portion of the file after the first 512 bytes, instead of the whole file.
Since lib/seq is not likely to require more than 511 digits of article
id number for quite some time, even on Portal, this will prevent the
locking from interfering with access to lib/seq, but it will still
permit "rnews -U" to lock out another "rnews -U".  The file descriptor
is not used for anything but locking, so no repositioning of the file
pointer after the locking is required.  The fact that the lib/seq file
is well under 512 bytes long doesn't bother the locking() call at all.

*** inews.old	Wed Aug 24 06:55:13 1988
--- inews.c	Mon Sep 12 23:12:20 1988
***************
*** 1435,1440 ****
--- 1435,1441 ----
  		xerror("opendir can't open .:%s", errmsg(errno));
  #ifdef	LOCKF
  	LockFd = xfopen(SEQFILE, "r+w");
+ 	lseek(fileno(LockFd), 512L, 0);
  	if (lockf(fileno(LockFd), F_TLOCK, 0L) < 0) {
  		if (errno != EAGAIN && errno != EACCES)
  #else	/* !LOCKF */

Note that there are some additional "problems" associated with news used
with mandatory locking.  For example, if you try to fire up "rn" (or
likely just about any newsreader) while expire is running, it will hang
until expire finishes, because the active file is locked.  (You can
SIGINT out of it, though).  I haven't decided whether this is a bug or a
feature.

					Gordon L. Burditt
					...!ninja!sneaky!gordon
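
The partial-lock idea in the patch can be tried outside of news.  What
follows is only a rough sketch, not code from the News source:
lock_tail() and region_free_for_child() are made-up names, the file is
whatever scratch path you hand it, and it assumes a POSIX lockf(), which
is advisory on most current systems (unlike Xenix locking()).  It takes
the same lock the patched "rnews -U" takes (seek to 512, then lock with
length 0), and then asks from a child process which byte ranges are
still free.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for lockf() under strict compiler modes */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LOCK_OFFSET 512L    /* same offset the patch seeks to */

/* Hypothetical helper: open (or create) path and try to lock everything
 * from LOCK_OFFSET onward, as the patched "rnews -U" does.
 * Returns the descriptor holding the lock, or -1 on failure. */
int lock_tail(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    /* Length 0 means "from the current offset to end of file and beyond". */
    if (lseek(fd, LOCK_OFFSET, SEEK_SET) < 0 || lockf(fd, F_TLOCK, 0) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: from a child process, test whether the range that
 * starts at offset and runs for len bytes (0 = to end of file) is lockable.
 * Returns 1 if free, 0 if another process holds part of it, -1 on error. */
int region_free_for_child(const char *path, off_t offset, off_t len)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: use its own descriptor, like a child rnews would. */
        int fd = open(path, O_RDWR);
        if (fd < 0 || lseek(fd, offset, SEEK_SET) < 0)
            _exit(2);
        _exit(lockf(fd, F_TEST, len) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:  return 1;      /* range is free */
    case 1:  return 0;      /* range is held by someone else */
    default: return -1;     /* child could not even open/seek */
    }
}
```

With the lock from lock_tail() held, region_free_for_child(path, 0, 512)
should report the first 512 bytes as free, which is the part a child
rnews touches when it updates lib/seq.  A test starting at offset 512,
or a whole-file test from offset 0 with length 0, should report the
range as held, which is how a second "rnews -U" would still be kept out.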