Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site rtech.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!gamma!epsilon!zeta!sabre!bellcore!decvax!tektronix!hplabs!amdahl!rtech!jas
From: jas@rtech.UUCP (Jim Shankland)
Newsgroups: net.unix-wizards
Subject: Re: Kernel mods and RTIngres
Message-ID: <465@rtech.UUCP>
Date: Tue, 4-Jun-85 20:46:00 EDT
Article-I.D.: rtech.465
Posted: Tue Jun 4 20:46:00 1985
Date-Received: Sun, 9-Jun-85 02:44:22 EDT
References: <410@wdl1.UUCP> <235@ucbcad.UUCP> <10712@brl-tgr.ARPA>
Organization: Relational Technology, Alameda CA
Lines: 83

I have been following with interest the recent discussion on concurrency management in RTI's INGRES, particularly Doug Gwyn's increasingly acerbic broadsides. As someone who works at RTI and is familiar with the issues involved, I believe I am in a position to correct some of the misperceptions Doug has been sharing with the net.

Briefly, Doug's arguments are: it is reprehensible of us to require our users to install a pseudo-device driver for INGRES concurrency control. We have no right to expect our users to make kernel mods in order to run INGRES, a mere applications program. Writing a user-level lock manager is straightforward, albeit different for different versions of UNIX. Furthermore, there are always the "flock" system call of 4.2bsd, and the "lockf" system call of the /usr/group standard. Finally, Doug says that INGRES does not even make use of the lock pseudo-device ("concurrent updates are not supported"), so that making the kernel mod is a pointless exercise.

Doug clearly implies that we at RTI don't know what we're doing. His last word (so far) is: "I'm not an expert on database systems (yet), but I recognize poor software design when I see it."

I'll dispose of the easiest complaint first by noting that it is simply incorrect that we do not make use of the lock pseudo-device. We support concurrent updates, as well as multi-statement transactions.
We use the locking pseudo-device to ensure that all concurrent transaction executions are serializable, while still maximizing concurrent access to shared data -- precisely what concurrency management in a DBMS is all about.

I'm certainly not wild about the idea of a lock device driver, and I don't think anyone else at RTI is, either. We have been investigating alternatives and will continue to do so. However, there are serious (not necessarily disqualifying) disadvantages to all alternatives we have found so far.

The "flock" system call, which locks an entire file at a time, is inadequate for our purposes; we need a finer granularity of locking than that. The "lockf" system call, which permits locking of an arbitrary, contiguous subsection of a file, is getting closer. But it is currently available on few, if any, of the systems on which we offer INGRES. Neither call provides more than two lock modes, does any deadlock detection, or performs any reasonable cleanup after a process exits abnormally (merely releasing the locks that process holds is NOT enough!). These are all serious shortcomings. Granted, we could make INGRES use these calls; but we would be offering a product with significantly reduced functionality compared with what we now offer.

The sad fact is that UNIX offers grossly inadequate concurrency control for a real DBMS. Anyone who doubts this might do well to look at silly, old VAX/VMS; its lock manager puts UNIX to shame.

That leaves us with the possibility of a user-level lock manager process. Such a process presumably receives lock request messages on some sort of named message channel, and sends response messages back. Named FIFOs and System V messages come to mind for System V, and sockets for 4.2bsd.
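
The request/response exchange described above can be sketched roughly as follows. This is only an illustration under loud assumptions, not how INGRES or any real lock manager works. It is written in Python for brevity; threads stand in for the separate lock manager process; an anonymous socket pair per client stands in for the named message channel; and the names LockTable, serve, lock and demo, the "resource mode" message format, and the GRANTED/DENIED replies are all hypothetical. A conflicting request is simply refused here rather than queued, and nothing is done about deadlock or about a client that dies while holding locks.

```python
import socket
import threading

SHARED, EXCLUSIVE = "S", "X"  # hypothetical lock modes for this sketch


class LockTable:
    """In-memory lock table: resource name -> [mode, set of holder names]."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.held = {}

    def request(self, client, resource, mode):
        with self.mutex:
            entry = self.held.get(resource)
            if entry is None:
                # Nobody holds the resource: grant in the requested mode.
                self.held[resource] = [mode, {client}]
                return "GRANTED"
            held_mode, holders = entry
            if mode == SHARED and held_mode == SHARED:
                # Shared locks are compatible with each other.
                holders.add(client)
                return "GRANTED"
            # Any other combination conflicts; this sketch refuses
            # instead of queueing the request.
            return "DENIED"


def serve(name, conn, table):
    """Lock manager side of one client channel."""
    with conn:
        while True:
            data = conn.recv(256)            # transfer 2: server reads request
            if not data:                     # client closed its end
                break
            resource, mode = data.decode().split()
            reply = table.request(name, resource, mode)
            conn.sendall(reply.encode())     # transfer 3: server writes reply


def lock(conn, resource, mode):
    """Client side: one request costs a write and a read."""
    conn.sendall(f"{resource} {mode}".encode())  # transfer 1: client writes
    return conn.recv(256).decode()               # transfer 4: client reads


def demo():
    table = LockTable()
    a_client, a_server = socket.socketpair()
    b_client, b_server = socket.socketpair()
    for name, conn in (("A", a_server), ("B", b_server)):
        threading.Thread(target=serve, args=(name, conn, table),
                         daemon=True).start()
    results = [
        lock(a_client, "table1.page7", SHARED),
        lock(b_client, "table1.page7", SHARED),
        lock(b_client, "table1.page7", EXCLUSIVE),
        lock(a_client, "table1.page9", EXCLUSIVE),
    ]
    a_client.close()
    b_client.close()
    return results


if __name__ == "__main__":
    print(demo())
```

Even in this toy form, each lock request involves two transfers on the client side and two on the server side, and each client occupies one descriptor in the lock manager.
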

The three major problems with this approach are:

  (1) how does the lock manager (asynchronously) find out about abnormal termination of a client?
  (2) if every client requires its own communications channel, how does the lock manager support more than 20 simultaneous clients, given UNIX's open-file limit? (Yes, this IS a requirement!)
  (3) a lock request will now take 4 system calls' worth of overhead, rather than 1 (client writes request, server reads, server writes response, client reads).

Spiros Triantafyllopoulos has contributed to the discussion by saying that INGRES is slow, and spends too much time in inter-process communication. Spiros appeared to be agreeing with Doug, yet Spiros' goals are diametrically opposed to Doug's. Doug brushes aside the performance issue by saying that a user-mode lock manager will perform "acceptably"; but will it be acceptable to Spiros?

The point of all this is not to insist that the pseudo-device driver is a good idea, or to say that all other solutions are unworkable. Rather, I hope I have shown that the issues are far from cut-and-dried: that every possible approach, including the pseudo-device driver, has serious disadvantages. As I mentioned above, we are actively investigating alternatives. We welcome constructive suggestions as well as just expressions of preference, particularly when they shed more light than heat on this complex subject.

Jim Shankland
..!ucbvax!mtxinu!rtech!jas
..!ihnp4!pegasus!rtech!jas