Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!turnkey!orchard.la.locus.com!prodnet.la.locus.com!jfr From: jfr@locus.com (Jon Rosen) Newsgroups: comp.databases Subject: Re: how to implement cursor based UPDATE/DELETE Message-ID: <24334@oolong.la.locus.com> Date: 13 May 91 19:09:09 GMT References: <22626@shlump.lkg.dec.com> <12260@mentor.cc.purdue.edu> Organization: Locus Computing Corp, Los Angeles Lines: 72 In article <12260@mentor.cc.purdue.edu> tmp@mentor.cc.purdue.edu (Tom Putnam) writes: >In article <22626@shlump.lkg.dec.com> saunders@refine.enet.dec.com >(Kyle Saunders) writes: >> I was wondering how you would implement cursor based UPDATE's and >>DELETEs. I have given some thought to this and cannot see how the RDBMS can >>properly update the correct row given the fact that the table to be updated >>may not have _any_ indexes, and not all fields are guarenteed to be included >>in the query. This means there is no way to uniquely determine which row in >>the original base table is being updated. > >.... The work-around was to recognize that (at least under INFORMIX), >there is a field called ROWID for each row. I can select it, then use it to >give me an "absolute pointer" to a record. > >Of course, one should be absolutely paranoid about ever keeping a ROWID around >for any longer than this sort of use. Systems need to be able to physically >move rows to other locations as the database is updated. Kyle questions how cursor-based UPDATEs and DELETEs can be implemented. If this refers to how an RDBMS implements it, the answer is related to Tom's response... I.e., internally, the RDBMS must keep some kind of a relative row id that it uses to link the cursor-retrieved row back to the actual data row. In fact, this implementation is quite common. It even leads to those databases such as Informix and Oracle actually materializing the ROWID into a user-accessible column which can be used to do direct-access updates against rows without using RDBMS cursors and the UPDATE/DELETE WHERE CURRENT OF statement. Also, it is not as bad as Tom thinks to keep using this ROWID. Usually, the ROWID is a logical construction and has nothing (or little) to do with the physical location/inode/record number of the actual row. So moving the row will generally not change its ROWID. I have seen an application in Oracle that even uses ROWID to build links between tables (UGGGH) in order to improve performance. This requires that the ROWID be persistent. There can be difficulties if rows are deleted and the associated ROWID references in other linked tables are not (a "referential integrity" issue) since ROWIDs can be reissued once the original row is no longer available. A better solution would be a fine granularity TIMESTAMP... In fact, Postgres uses this solution to individualize rows and to allow for update/delete with full history. All versions of rows are kept (until explicitly purged) and you can browse through outdated versions of rows at will. Of course, the default is to look at the current version of a row. Unfortunately, the ROWID solution begs several questions. First of all, a materialized ROWID is by no means universal. Most RDBMS (including DB2, OS/2 EE, RDB, Ingres, etc.) do not have such an abstraction. In addition, neither ROWID nor updatable cursors deal with data integrity across COMMITs. When data is being updated on-line, it is not advisable to hold a cursor open across the workstation or terminal i/o (otherwise locks are kept for long periods of time). Thus, if you use a cursor to select some rows for display, you still should COMMIT prior to displaying them. Now, your updatable cursor is no longer updatable. In fact, you will either have to reread the rows with a new updatable cursor or use a direct UPDATE. Since you have no lock on the data while you are using it on-screen, someone else can come along, read and modify the row without you knowing it. Once again, either a very fine granularity TIMESTAMP or an update/change count column is needed to ensure that your update does not screw up some other data that has been changed since your read. Alternatively, you have to do an UPDATE with a WHERE clause on every column of the TABLE giving the original value in order to make sure that the row has not been changed (bad idea from a performance standpoint). As for the duplicate row issue, don't you know that relational data bases aren't supposed to have duplicate rows??? :-) Jon Rosen