Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!dciem!nrcaer!cognos!garyp From: garyp@cognos.uucp (Gary Puckering) Newsgroups: comp.databases Subject: Re: chest thumping (Actually, degree-3 consistency) Message-ID: <939@cognos.UUCP> Date: Fri, 19-Jun-87 00:37:43 EDT Article-I.D.: cognos.939 Posted: Fri Jun 19 00:37:43 1987 Date-Received: Tue, 23-Jun-87 03:02:32 EDT References: <959@killer.UUCP> <853@smokey.UUCP> <4915@utcsri.UUCP> Reply-To: garyp@cognos.UUCP (Gary Puckering) Organization: Cognos Incorporated, Ottawa, Canada Lines: 98 In article <4915@utcsri.UUCP> vassos@utcsri.UUCP writes: >> [...] identifying problems that are of a generic nature >> (such as problems with the degree-3 consistency model in on-line >> applications) are of interest to everyone [...] > >What is the problem with serialisability (which, I believe, is what >degree-3 consistency is) for on-line applications? I was hoping someone would ask. Actually, degree-3 consistency is not exactly the same as serializability. Two transactions are said to be serializable as long as an interleaved execution of their updates produces the same result as a serial execution. Degree 3 consistency refers to a level of isolation that is equivalent to each transaction being the sole user of the database. In particular, degree 3 consistency prevents the problem of "phantoms". To illustrate a phantom, consider a transaction which executes the following query: select all employees with salary > 50,000 If the same query is repeated within that transaction, it is possible that some other concurrent transaction may have inserted a new employee with a salary > 50,000 between the time of the first execution of the query and the second execution. Thus, the new employee will show up -- a "phantom". If the two transactions executed serially, this could not have happened. Now, as to problems with on-line applications. Theoretically, there is nothing wrong that I can think of with degree 3 consistency. Practically speaking, though, most implementations of relational systems that I've seen (Rdb/VMS, DG/SQL, HP Allbase, and what I've read or heard about Oracle, Ingres, DB2 and SQL/DS) have one or more problems associated with implementation of degree 3 consistency that cause significant reductions in concurrency. A typical example: in Rdb/VMS if several transactions are attempting to insert tuples into a relation with a unique index, it is likely that one or more may become deadlocked or go into a long wait state -- even if there is no collision on unique key values! Why? Well, to prevent phantoms, Rdb/VMS must lock indexes for inserts. Another example: suppose you want to develop a screen that lets the user browse through the database and, if he/she so desires, to update any tuple that might be seen. Ok. This suggests the use of a read_write transaction. But, use of a read_write transaction causes each tuple (probably page, depending on the lock granularity of your system) to be S locked. That's so someone else doesn't come along and change it on you (remember degree 3 consistency). So what if the user goes to lunch before committing the transaction? All those tuples/pages that he's read are S locked, preventing anyone from updating them. Now lets suppose the user comes back from lunch. He continues browsing and comes across a record he wants to change. He changes it. This tuple/page now gets an X lock. In most systems, that means other read_write transactions will have to wait until the lock is released before they can read that tuple/page. More delays for concurrent users. Well, lets say this user is aware that he shouldn't leave uncommitted updates around too long, so he commits the transaction. Ok. That frees all the locks. But, it also releases his cursor. So, if he was only halfway through browsing along the particular predicate path, he now has to restate is predicate to eliminate the tuples he's already seen. This approach doesn't look too good -- although it's the most obvious one. Lets go back to the beginning. Why not use a read_only transaction for the browse. That means no S locks are held. The system returns a version of each tuple that was current as at the start of the transaction. Ok, so how do we update? Well, we could start a read_write transaction each time the user requests an update. Fetch the tuple that the user wants to update (you'll need a unique key value or a database key) and check to make sure it isn't different from the one the user fetched on his read_only transaction (a distinct possibility) and update it. Commit the read_write transaction. Now the user can continue on the browse transaction (it still hasn't been committed, so the cursor is still valid). This approach works but puts the onus on the application to achieve concurrency. It also increases the risk that the users transaction will fail because some concurrent transaction has changed data out from under him. Yet another approach is to lower the isolation level of the browse/update transaction to degree 2. Transactions of this sort have serializable updates, but may see phantoms. In this application, I don't consider phantoms to be a problem. The advantage to degree 2 is that you don't place S locks on tuples that are read and cursor stability can be maintained after a commit. The catch: most relational systems don't allow degree 2 consistency. They enforce degree 3. Those that do make degree 3 the default, and most programmers don't know any better. The result: poor concurrency. In my humble opinion, a large part of the perceived "poor" performance of relational systems may be related to the problems I've just described. -- Gary Puckering 3755 Riverside Dr. Cognos Incorporated Ottawa, Ontario decvax!utzoo!dciem! (613) 738-1440 CANADA K1G 3N3 nrcaer!cognos!garyp