Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!cs.utexas.edu!uunet!mcvax!ukc!acorn!john
From: john@acorn.co.uk (John Bowler)
Newsgroups: comp.windows.x
Subject: Re: rgb database corruption
Summary: rgb database corruption - a possible explanation
Message-ID: <796@acorn.co.uk>
Date: 16 Jun 89 12:41:45 GMT
References: <8906142323.AA04774@expire.lcs.mit.edu>
Organization: Acorn Computers Limited, Cambridge, UK
Lines: 62

In article <8906142323.AA04774@expire.lcs.mit.edu>, rws@EXPO.LCS.MIT.EDU writes:
> Here's an unofficial diff fragment to server/os/4.2bsd/osinit.c that might or
> might not cause this problem to disappear.  You can try it if you are being
> pestered by this problem.  You should probably ignore it if you aren't.
> Your mileage may vary.
>
> [Patch - most omitted]
> !     if (!(err = fopen (fname, "a+")))
> !         err = fopen ("/dev/null", "w");
> !     if (err && (fileno(err) != 2)) {
> !         dup2 (fileno (err), 2);
> !         fclose (err);
> !     }

This fixes one obvious problem, but this problem (connection of stderr to a
file descriptor other than fd 2) is not the only possible cause of rgb
database corruption.  I have been running with appropriately fixed R2 code
and still observed these symptoms.  For my code to fail a subsequent open of
/dev/null must also fail - I come to the conclusion that this must be
happening (very rarely) on the systems I use - and I notice that the above
code will still go wrong if the fopen ("/dev/null", "w") fails.

For the database to be corrupted (given the normal installation mechanism)
the server must be running as root and (at least) the open of
/usr/adm/X?msgs or the subsequent dup2 must fail.  Assuming the directory
/usr/adm exists the only likely reason for failure on a bsd, or bsd-tahoe,
system, is if the kernel file table fills up - which will tend to mean that
all the opens fail together.  I reckon the server should check both ``err''
and fileno(stderr) and, if either is wrong, it should give up.
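To make the "check both and give up" suggestion concrete, here is a minimal
sketch of the quoted patch with the missing checks added.  The function name
attach_stderr and its return convention are made up for illustration; they
are not from the X server source, and this has not been tried inside a real
server.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the X server source): point fd 2 at the
 * log file `fname`, falling back to /dev/null.  Returns 0 on success and
 * -1 when the caller should give up rather than carry on.
 */
int attach_stderr(const char *fname)
{
    FILE *err = fopen(fname, "a+");

    if (err == NULL)
        err = fopen("/dev/null", "w");
    if (err == NULL)
        return -1;              /* both opens failed: give up */

    if (fileno(err) != 2) {
        if (dup2(fileno(err), 2) == -1) {
            fclose(err);
            return -1;          /* fd 2 could not be redirected: give up */
        }
        fclose(err);            /* fd 2 is now a copy; drop the original */
    }
    /* If fopen happened to return fd 2, leave err open: it *is* fd 2. */

    /* Finally make sure the stdio stream itself still sits on fd 2. */
    return fileno(stderr) == 2 ? 0 : -1;
}
```

A caller would then do something like
"if (attach_stderr(fname) < 0) _exit(3);" before the rgb database is opened.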
Of course, I'm biased - Acorn's customers received X binaries on 50MByte
discs (so no possibility of fitting the source on).  If they manage to
corrupt their rgb database they can do nothing about it short of going to
the level 0 backup which they did, of course, make as soon as they got the
system...

The following code fragment (**NOT guaranteed - bsd specific - caveat
emptor**) should work.  A better fix, for those with access to the ndbm
package, is to hack oscolor.c and osinit.c to open the database O_RDONLY
(if only it was possible to cause dbm to open the database read-only - but
even making the files read-only doesn't help under bsd; the super user can
always write to them).

/*
 * This is done in this nasty way to ensure that the correct file descriptors
 * end up connected to the correct place.
 */
    /* Zap stdin and stdout */
    if (freopen("/dev/null", "r", stdin) == NULL) _exit(2);
    if (freopen("/dev/null", "w", stdout) == NULL) _exit(2);
    /* See if stderr is a reasonable stream, if it is assume it is ok */
    if (fcntl(2, F_GETFD, 0) == (-1)) {
        char fname[MAXPATHLEN+1];
        sprintf (fname, ADMPATH, display);
        /* Give up if every reopen fails, or if stderr is not on fd 2 */
        if ((freopen(fname, "a+", stderr) == NULL &&
             freopen("/dev/tty", "a+", stderr) == NULL &&
             freopen("/dev/console", "a+", stderr) == NULL) ||
            fileno(stderr) != 2)
            /* Could output error message here */
            _exit(3);
    }
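For anyone puzzled by the fcntl(2, F_GETFD, 0) test: it only asks the kernel
whether the descriptor is open at all.  A minimal sketch of that check on its
own follows; the helper name fd_is_open is invented here for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if `fd` is an open file descriptor and
 * 0 if it is not.  F_GETFD only reads the descriptor flags, so the call
 * has no side effects; on a closed descriptor it fails with EBADF.
 */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
```

Note that this says nothing about where the descriptor points, only that it
is open, which is why the fragment above "assumes it is ok" in that case.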