Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!dciem!nrcaer!cognos!garyp
From: garyp@cognos.uucp (Gary Puckering)
Newsgroups: comp.databases
Subject: Re: chest thumping (Actually, degree-3 consistency)
Message-ID: <939@cognos.UUCP>
Date: Fri, 19-Jun-87 00:37:43 EDT
Article-I.D.: cognos.939
Posted: Fri Jun 19 00:37:43 1987
Date-Received: Tue, 23-Jun-87 03:02:32 EDT
References: <959@killer.UUCP> <853@smokey.UUCP> <4915@utcsri.UUCP>
Reply-To: garyp@cognos.UUCP (Gary Puckering)
Organization: Cognos Incorporated, Ottawa, Canada
Lines: 98

In article <4915@utcsri.UUCP> vassos@utcsri.UUCP writes:
>> [...] identifying problems that are of a generic nature
>> (such as problems with the degree-3 consistency model in on-line
>> applications) are of interest to everyone [...]
>
>What is the problem with serialisability (which, I believe, is what
>degree-3 consistency is) for on-line applications?

I was hoping someone would ask.

Actually, degree-3 consistency is not exactly the same as serializability.
Two transactions are said to be serializable as long as an interleaved
execution of their updates produces the same result as a serial execution.

Degree 3 consistency refers to a level of isolation that is equivalent
to each transaction being the sole user of the database.  In particular,
degree 3 consistency prevents the problem of "phantoms".  To illustrate
a phantom, consider a transaction which executes the following query:

    select all employees with salary > 50,000

If the same query is repeated within that transaction, it is possible
that some other concurrent transaction may have inserted a new employee
with a salary > 50,000 between the time of the first execution of the
query and the second execution.  Thus, the new employee will show up --
a "phantom".  If the two transactions executed serially, this could not
have happened.


Now, as to problems with on-line applications.  Theoretically, there
is nothing wrong that I can think of with degree 3 consistency.
Practically speaking, though, most implementations of relational
systems that I've seen (Rdb/VMS, DG/SQL, HP Allbase, and what I've
read or heard about Oracle, Ingres, DB2 and SQL/DS) have one or more
problems associated with implementation of degree 3 consistency that
cause significant reductions in concurrency.

A typical example:  in Rdb/VMS if several transactions are attempting
to insert tuples into a relation with a unique index, it is likely
that one or more may become deadlocked or go into a long wait state --
even if there is no collision on unique key values!  Why?  Well, to
prevent phantoms, Rdb/VMS must lock indexes for inserts.

Another example:  suppose you want to develop a screen that lets the
user browse through the database and, if he/she so desires, to update
any tuple that might be seen.  Ok.  This suggests the use of a
read_write transaction.  But, use of a read_write transaction causes
each tuple (probably page, depending on the lock granularity of your
system) to be S locked.  That's so someone else doesn't come along and
change it on you (remember degree 3 consistency).  So what if the user
goes to lunch before committing the transaction?  All those tuples/pages
that he's read are S locked, preventing anyone from updating them.

Now lets suppose the user comes back from lunch.  He continues browsing
and comes across a record he wants to change.  He changes it.  This
tuple/page now gets an X lock.  In most systems, that means other
read_write transactions will have to wait until the lock is released
before they can read that tuple/page.  More delays for concurrent users.
Well, lets say this user is aware that he shouldn't leave uncommitted
updates around too long, so he commits the transaction.  Ok.  That
frees all the locks.  But, it also releases his cursor.  So, if he was
only halfway through browsing along the particular predicate path, he
now has to restate is predicate to eliminate the tuples he's already seen.

This approach doesn't look too good -- although it's the most obvious
one.  Lets go back to the beginning.  Why not use a read_only
transaction for the browse.  That means no S locks are held.  The
system returns a version of each tuple that was current as at the
start of the transaction.  Ok, so how do we update?  Well, we could
start a read_write transaction each time the user requests an update.
Fetch the tuple that the user wants to update (you'll need a unique
key value or a database key) and check to make sure it isn't different
from the one the user fetched on his read_only transaction (a distinct
possibility) and update it.  Commit the read_write transaction.  Now
the user can continue on the browse transaction (it still hasn't been
committed, so the cursor is still valid).  This approach works but
puts the onus on the application to achieve concurrency.  It also
increases the risk that the users transaction will fail because some
concurrent transaction has changed data out from under him.

Yet another approach is to lower the isolation level of the
browse/update transaction to degree 2.  Transactions of this sort have
serializable updates, but may see phantoms.  In this application, I
don't consider phantoms to be a problem.  The advantage to degree 2 is
that you don't place S locks on tuples that are read and cursor
stability can be maintained after a commit.  The catch:  most
relational systems don't allow degree 2 consistency.  They enforce
degree 3.  Those that do make degree 3 the default, and most
programmers don't know any better.  The result:  poor concurrency.

In my humble opinion, a large part of the perceived "poor" performance
of relational systems may be related to the problems I've just
described.

-- 
Gary Puckering        3755 Riverside Dr.
Cognos Incorporated   Ottawa, Ontario       decvax!utzoo!dciem!
(613) 738-1440        CANADA  K1G 3N3       nrcaer!cognos!garyp