Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!ut-sally!husc6!uwvax!rutgers!sri-spam!sri-unix!hplabs!pyramid!prls!philabs!mcnc!duke!adiron!tish
From: tish@adiron.UUCP (Tish Todd)
Newsgroups: net.unix-wizards
Subject: vaxingres bugs which can cause database corruption
Message-ID: <356@adiron.UUCP>
Date: Wed, 15-Oct-86 15:35:58 EDT
Article-I.D.: adiron.356
Posted: Wed Oct 15 15:35:58 1986
Date-Received: Sat, 18-Oct-86 23:40:10 EDT
Organization: PAR Technology, New Hartford, NY
Lines: 73
Keywords: ingres

Index: .../ingres/source/iutil/markopen.c

Description:

A bug exists in .../ingres/source/iutil/markopen.c which can cause
serious database corruption. This routine is used in building
vaxingres, which is the main database manipulation program. The bug
exhibits itself under Ultrix, but not under 4.2 BSD. The reason for
this will become apparent further down in this explanation.

When vaxingres starts up, it does the following (among other things):

1. Attempts to open a socket and connect to the lock_driver. If the
   lock_driver is not active, there is no further problem. (Note that
   the 'lock_driver' is the ingres concurrency daemon.)

2. Uses fstat to determine which file descriptors are currently in
   use. It presumes that at this point only essential things, such as
   pipes, are open. It uses the information from fstat to build a
   mask of the descriptors which should always remain open.
   Everything else is considered throwaway at certain times. See the
   next step. The offending code (at about line 36 in markopen.c):

        for (i = 0; i < NOFILE; i++) {
                if (fstat(i, &sbuf) >= 0)
                        *ovect |= 1 << i;
        }

3. Using the mask built in step 2, it calls closeall(), which is a
   dumb, uncaring brute which merely closes *all* files not protected
   by that mask. closeall is executed at various points during
   vaxingres' lifetime, presumably after any database operation has
   been completed (my guess).
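
Steps 2 and 3 above can be sketched roughly as follows. This is not the
Ingres source: the names mark_open, close_unmarked and MASK_BITS are
made up for illustration, the mask is returned rather than stored
through a pointer, and the loop is capped at the width of an unsigned
long instead of NOFILE. The marking loop deliberately mirrors the check
quoted above, looking only at fstat's return value.

```c
#include <assert.h>
#include <limits.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of descriptors one unsigned long mask can describe. */
#define MASK_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Step 2 (hypothetical stand-in for markopen): set a bit for every
 * descriptor on which fstat succeeds.  Like the quoted code, this
 * looks only at fstat's return value. */
unsigned long mark_open(int nfds)
{
    struct stat sbuf;
    unsigned long mask = 0;
    int i;

    assert(nfds >= 0);
    for (i = 0; i < nfds && i < MASK_BITS; i++)
        if (fstat(i, &sbuf) >= 0)
            mask |= 1UL << i;
    return mask;
}

/* Step 3 (hypothetical stand-in for closeall): close every descriptor
 * whose bit is clear in the mask.  Returns the number of descriptors
 * that were actually open and got closed. */
int close_unmarked(unsigned long mask, int nfds)
{
    int i, closed = 0;

    assert(nfds >= 0);
    for (i = 0; i < nfds && i < MASK_BITS; i++)
        if (!(mask & (1UL << i)) && close(i) == 0)
            closed++;
    return closed;
}
```

Any descriptor that is open but whose bit is missing from the mask is
closed by close_unmarked without its owner being told, which is the
situation the rest of this article is about.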

The fatal bug appears in step 2, although the symptom shows up later.
Under Ultrix, fstat is not supported on file descriptors which are open
for sockets: it returns a negative number with errno set to EOPNOTSUPP.
The code which builds the mask does not check errno - it only checks
the return from fstat. As a result, the mask built in markopen()
indicates that the file descriptor for the lock_driver is not open.
Thus, when closeall is invoked, it closes the socket connection to the
lock_driver. However, because closeall - and by extension, vaxingres -
has no smarts, vaxingres still assumes that the socket is alive and
well.

Now, what happens is that vaxingres is told to open a database file.
Because open returns the lowest-numbered free descriptor, the open call
just happens to use the descriptor previously used for the lock_driver
socket. vaxingres does its thing with the database file ok, but when it
writes to the lock_driver, whammo, the information goes into the data
file instead. Corruption city.

Note that the reason vaxingres does not exhibit the problem under
4.2 BSD is that (at least in our BSD source code) fstat returns 0 in
this situation. Thus the above check is satisfied and the descriptor is
assumed to be in use.

Fix:

The following change to markopen.c cures the problem (note that one
could instead mark the descriptor only when errno == EOPNOTSUPP):

        for (i = 0; i < NOFILE; i++) {
                if (fstat(i, &sbuf) >= 0)
                        *ovect |= 1 << i;
                /* else if not EBADF - descriptor *might* be in use */
                else if (errno != EBADF)
                        *ovect |= 1 << i;
        }
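
The corruption scenario described above (socket closed behind the
program's back, descriptor number reused by the next open, "lock"
traffic landing in the data file) can be reproduced in miniature. The
following is only a sketch under assumptions: a socketpair stands in
for the lock_driver connection, a scratch file stands in for the
database file, and the function name stale_write_hits_file, its path
argument and the "LOCK" payload are all made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a write through the stale descriptor number landed in
 * the newly opened file, 0 if it did not, -1 if setup failed.
 * `path` names a scratch file that is created and then removed. */
int stale_write_hits_file(const char *path)
{
    int sv[2];
    int lock_fd, db_fd, hit = 0;
    char buf[4];

    assert(path != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    lock_fd = sv[0];        /* stand-in for the lock_driver socket */

    close(lock_fd);         /* what closeall does to an unmarked fd */

    /* Stand-in for opening a database file; open hands back the
     * lowest free descriptor, i.e. the number just released. */
    db_fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (db_fd < 0) {
        close(sv[1]);
        return -1;
    }

    /* The program still believes lock_fd is its socket. */
    if (db_fd == lock_fd && write(lock_fd, "LOCK", 4) == 4) {
        lseek(db_fd, 0, SEEK_SET);
        if (read(db_fd, buf, 4) == 4 && memcmp(buf, "LOCK", 4) == 0)
            hit = 1;        /* the "lock" bytes are in the data file */
    }

    close(db_fd);
    unlink(path);
    close(sv[1]);
    return hit;
}
```

Calling it with a writable scratch path, for example
stale_write_hits_file("/tmp/stale_fd_demo.tmp"), reports whether the
bytes meant for the socket ended up in the file. With the fix above,
the socket's bit stays set in the mask, so the close that starts this
chain never happens.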