Path: utzoo!lsuc!spectrix!yunexus!geac!john From: john@geac.UUCP (John Henshaw) Newsgroups: comp.databases Subject: Re: Distributed Database Locking Summary: general comments Message-ID: <2465@geac.UUCP> Date: 18 Mar 88 19:29:24 GMT References: <207@tijc02.UUCP> Organization: GEAC Computers, Toronto, CANADA Lines: 130 Posted: Fri Mar 18 14:29:24 1988 In article <207@tijc02.UUCP>, pjs269@tijc02.UUCP (Paul Schmidt) writes: > First I feel there are two methods for locking distributed databases. > [much text deleted] I'll confine most of my comments to LOCKING strategies. There are of course, several non-locking approaches, but I assume that you are focusing on locking only. Some comments further down reference other methods... The preemptive locking strategy (PRE) has the benefit that deadlock cannot occur. However, we rejected it as a viable method on "practical grounds". In particular, PRE requires that the programmer know *all* the resources that will be required *prior* to the initiation of the transaction. This is rarely the case. As far as I know, all non-PRE locking schemes require deadlock detection, which is a pain, but a necessary one. PRE will lock resources at the earliest point in time (transaction initiation) and hold them for the duration of the transaction. This means that the "growing phase" of the 2-phase locking activity happens very quickly, and the transaction can proceed without ever being blocked. This suggests that PRE may hold transaction locks for a SHORTER time than other schemes, not longer. I suspect that the real answer to this issue could be resolved via simulation and that the major elements affecting this would be the probability of conflict between transactions, and the level of concurrency (number of transactions executing at once). You write: > By > locking resources when they are needed, one can avoid unnecessary locks > but a deadlock detection scheme would have to be used. Could you please explain what is meant by an "unnecessary lock"? I imagine that you really meant to address the *timely* initiation of locks so that objects (resources) are locked for as short a period as possible. Otherwise, I will disagree: "If ya gotta lock it, ya gotta lock it...", hence all locks *are* necessary. You write: > Locking of all resources at the beginning of the transaction has the > following advantages: ease of implementation and deadlock prevention. > This will be easier to implement since a deadlock detection algorithm > and transaction rollback scheme does not have to be implemented. PRE, while deadlock free, does not free the implementor from transaction rollback. While it is clear that deadlock requires transaction rollback, there are other reasons why one may roll back a transaction. For example, a end-user may deliberately choose to ABORT a transaction, or the application software may decide, due to various conditions (data dependencies, data values, etc.) to ABORT. Either case requires rollback. You write: > Locking of resources as needed has the following advantages: Resources > are locked only when being used, and the total transaction does not need to > be known at the start of the transaction. Since resources are only locked > when they are being used, processing time will not be lost when another > process is waiting on that resource. This description is somewhat inexact. Resources, once locked, are locked until Commit time of the transaction. Having locked a resource/object, other processes will be blocked accessing that object. (This assumes that the object has been either written, or is assumed to be updatable at the time of lock initiation.) You point out a major advantage of standard 2 phase locking - that the programmer need not know which objects/resources will be locked at the Start Transaction juncture. It might be useful in your analysis to distinguish between the types of 2 phase locking that you have in mind. For example, when a lock is granted on an object which is read, do you assume that you are going to update it? or do you convert read locks to write locks at update time? A reasonably good paper involving this issue is: "The Performance of Concurrency Control Algorithms for Database Management Systems" by Michael Carey, and Michael Stonebreaker, 1984, VLDB Publications (Proceedings of the Tenth International Conference on Very Large Databases). It introduces a useful simulation framework that you may find helpful in your efforts. You ask: > Questions: > > Are there other methods? Sure: tree-locking, timestamp, optimistic, and multiversion are all written up. I suggest that you check out the library, looking at the last few years of: "Transactions on Database Systems". If you can get hold of "Distributed Systems: Volume 2", edited by Wesley Chu, ISBN 0-89006-213-7, chapter 2 talks about concurrency control, and other chapters address several case studies. Author names that may be useful are: Bernstein, Carey, Chu, Eager, Ellis, Goodman, Gray, Kung, Moss, Obermarck, Sevcik, Stonebreaker, Robertson, and Weiderhold, to name only a few! > What are the distributed deadlock detection schemes? The two I know about are "wait-for-strings", and "wait-for-graphs". Most systems that do deadlock detection, don't check for deadlock every time a lock is placed. Instead, a transaction, when blocked, times out, and then the check is made. In the distributed case, the lifetime of the average transaction may be quite long, and communication delays compound the difficulties that this approach already has. This approach of waiting is done in order to minimize the cost of checking graphs for cycles - usually a CPU-intensive process. > How can one tell when one method is better than the other? Good question. I think the answer to this question depends a lot on your criteria - something that differs for each situation. Reliability, robustness, efficiency, correctness, etc., etc., etc.... are all in the list somewhere. Judging DISTRIBUTED locking schemes is like judging CENTRALIZED locking schemes, except a few of the base rules/assumptions are changed. > How do crash recovery and deadlock detection rollback relate? Are most of > the recovery problems already solved by crash recovery? > Judging from your question, you are more interested in Distributed Transaction Management - not just "distributed locking". The "short" answer to this question is "yes", but the implementation details are where the real heart of the matter lies. Good luck with your work Paul. -john- -- John Henshaw, (mnetor, yunexus, utgpu !geac!john) Geac Computers Ltd. If we don't pay for education now, are we Markham, Ontario, Canada, eh? going to be able to pay for ignorance later?