Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Path: utzoo!utgpu!utcsri!vassos From: vassos@utcsri.UUCP Newsgroups: comp.databases Subject: Degree 3 consistency (serialisability) Message-ID: <4996@utcsri.UUCP> Date: Sat, 27-Jun-87 14:00:53 EDT Article-I.D.: utcsri.4996 Posted: Sat Jun 27 14:00:53 1987 Date-Received: Sat, 27-Jun-87 17:35:41 EDT Organization: CSRI, University of Toronto Lines: 64 In response to my question as to the problems of "degree 3" consistency in connection with on-line trasnsactions, Gary Puckering gave some interesting scenarios illustrating these problems. It seems to me that the main point of his examples is that the trouble is caused by transactions holding exclusive locks on records and/or indices for too long, thereby forcing other transactions to wait too much. One of the remedies he suggests is lowering the degree of consistency. I disagree with this remedy for the following reasons: The only correct way of interleaving transactions in a general purpose system is by enforcing serialisability. (Let me state the sense in which I use the word, as it differs a bit from the definition Gary gave: An execution of transactions is serialisable if it is computationally equivalent to a serial execution of the same transactions. Note that the equivalence in question refers, not just to updates of transactions, but also to what they read. So my use of "serialisability" takes care of phantoms -- see Gary's article for what these are.) Kung and Papdimitriou have proved a theorem in effect stating that in the absence of detailed knowledge about the integrity constraints on the database and of the semantics of transactions, anything less than serialisability violates consistency. Thus, "degree 2 consistency" is really a degree of inconsistency! If we accept that the integrity of data is valuable (which, as database people, we do!), we can't afford to let silly things like bad interleavings of transactions mess up the integrity of our data. It's bad enough having to cope with inconsistencies due to data entry errors and the like! This is the main reason why I think system designers who make serialisability the default in concurrency control have actually made the right choice. In connection to the issue of performance, the following point (made by Jim Gray in his paper "A transaction model") is relevant: "In the experiments we have done, degree 3 consistency has a cost (throughput and processor overhead) indistinguishable from the lower degrees of consistency." Even though I know nothing about the experiments mentioned, I put much weight in this statement, coming as it does from the mouth (pen? keyboard?) of the person most responsible for the formulation of the theory of "degrees of consistency". Of course, the performance problem Gary has pointed out is still there. One possible avenue that could lead to a reasonable solution without compromising serialisability is through the use of mutliple versions. That is, each updater of a record does not destroy the old version of that record but creates a new one. There are various concurrency control algorithms that ensure serialisability while reducing the amount of waiting, by exploiting the existence of such multiple versions. Of course such an approach has its own costs (storage for the multiple versions, and complexity in the concurrency control algorithm) but it seems reasonable that one would have to pay a price for higher performance. In my opinion, however, that price should not be the price of inconcistent data. Apparently, Prime has a database system that employs a multiversion technique. It would be interesting to know if there are other commercial systems that do, and whether this leads to appreciable performance improvements for situations of the sort described by Gary. -- Vassos Hadzilacos vassos@csri.toronto.edu