Xref: utzoo comp.object:3631 comp.databases:10326
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!udel!rochester!pt.cs.cmu.edu!o.gp.cs.cmu.edu!netnews.srv.cs.cmu.edu!clamen
From: clamen@CS.CMU.EDU (Stewart Clamen)
Newsgroups: comp.object,comp.databases
Subject: Re: Schema/Type Evolution in Traditional and O-O DBMSs
Message-ID:
Date: 30 May 91 02:50:24 GMT
References:
Sender: netnews@cs.cmu.edu (USENET News Group Software)
Reply-To: clamen+@CS.CMU.EDU
Organization: School of Computer Science, Carnegie Mellon University
Lines: 424
In-Reply-To: clamen@CS.CMU.EDU's message of 21 May 91 17:32:07

The following is the result of my public survey into the schema
evolution and database conversion support exhibited by known research
and commercial DB and OODBMS.  Further contributions are welcome.

--------------------*-*---------------------

Direct quotes are attributed by including the email address of the
poster directly following the information.  Prose written by the
poster, but with primary information provided by email, is so
identified.  Information gleaned from publications is so noted.

The information included here is not intended to completely describe
the systems addressed, but rather to describe what support, if any,
is provided by the system for the evolution of schemas and the
conversion of database objects (class instances) resulting from the
schema change.

SMC

----------*-*----------

<<< EXTENDED RELATIONAL DB MODEL >>>

<< Research Systems >>

> POSTGRES (Berkeley)

You ask explicitly about type evolution.  We support schema
modification on all classes, including user classes.  This means that
you can add attributes (instance slots) and methods at any time.
Further, since postgres is a shared database system, such changes are
instantly visible to any other user of the class.  The language
syntax supports attribute deletion, but the system won't do it yet.
Since all data is persistent, removing attributes from a class
requires some work: you need to either get rid of or ignore all the
values you've already stored.
[mao@postgress.berkeley.edu]

<<< OO DATA MODEL >>>

<< Research Systems >>

> COOL/COCOON (ETH Zurich)

No implementation as yet.  Project goals are:

- to develop a general formal framework for investigations of all
  kinds of schema changes in object-oriented database systems
  (including schema design, schema modification, schema tailoring,
  and schema integration);

- to find implementation techniques for evolving database schemas,
  such that changes on the logical level propagate automatically to
  adaptations of the physical level (without the need to modify all
  instances, if possible).

Contact Markus Tresch for more information.

> Encore (Brown)

Objects are never converted; rather, classes are versioned, and the
user can specify filters to make old-style instances appear as new
instances to new applications (and vice versa).

REFS: Andrea H. Skarra and Stanley B. Zdonik.  "Type Evolution in an
Object-Oriented Database."  In "Research Directions in
Object-Oriented Programming", edited by Shriver and Wegner.  (An
earlier version of the paper appears in the proceedings of OOPSLA86.)
[clamen]

> ORION (MCC/Itasca System, Inc.)

ORION is a prototype OODBMS developed at MCC, an American consortium.
It is built on top of Common Lisp, and is intended to support
applications in the CAD/CAM, AI, and OIS domains.  Advanced functions
supported include [object] versions, change notification, composite
objects, dynamic schema evolution, and multimedia data.

For schema evolution, ORION identifies a list of database-consistency
constraints that must be preserved across any class evolution
operation.  The authors then list the types of evolution operations
you can perform, and how the relevant instances can be converted.
Conversion is performed as the instances are accessed.
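
To make "conversion as the instances are accessed" concrete, here is
a minimal Python sketch of lazy conversion.  It is an illustration
only and is not taken from ORION: the names StoredObject, ClassDef,
evolve, add_attribute, drop_attribute and fetch are all hypothetical,
and I am assuming that each stored instance carries the version
number of the class definition it was written under.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

Slots = Dict[str, Any]
Upgrade = Callable[[Slots], Slots]


@dataclass
class StoredObject:
    """A persistent instance, tagged with the class version it was written under."""
    version: int
    slots: Slots


@dataclass
class ClassDef:
    """Hypothetical class descriptor: current version plus one upgrade step per change."""
    version: int = 1
    # upgrades[v] converts slots written under version v to version v + 1
    upgrades: Dict[int, Upgrade] = field(default_factory=dict)

    def evolve(self, step: Upgrade) -> None:
        """Record a schema change; no stored instance is touched here."""
        self.upgrades[self.version] = step
        self.version += 1


def add_attribute(name: str, default: Any) -> Upgrade:
    """Upgrade step that adds a slot with a default value."""
    return lambda slots: {**slots, name: default}


def drop_attribute(name: str) -> Upgrade:
    """Upgrade step that discards a slot and its stored value."""
    return lambda slots: {k: v for k, v in slots.items() if k != name}


def fetch(obj: StoredObject, cls: ClassDef) -> StoredObject:
    """Bring a stale instance up to date at access time (lazy conversion)."""
    while obj.version < cls.version:
        obj.slots = cls.upgrades[obj.version](obj.slots)
        obj.version += 1
    return obj


part = ClassDef()
bolt = StoredObject(version=1, slots={"name": "bolt", "colour": "grey"})
part.evolve(add_attribute("weight", 0.0))   # bolt is not touched yet
part.evolve(drop_attribute("colour"))       # still not touched
print(fetch(bolt, part).slots)              # converted only now
```

The point of the design is that evolve() costs nothing per instance;
the price is paid one object at a time in fetch().  An eager scheme
would instead walk every stored instance inside evolve().
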

I have found nearly a dozen papers published by the ORION folks.  The
most recent and general one is:

W. Kim, N. Ballou, H-T. Chou, J.F. Garza, D. Woelk, and J. Banerjee.
"Integrating an Object-Oriented Programming System with a Database
System."  Proceedings of OOPSLA88.

[Pointers to the previous papers documenting each of the advanced
features listed above are cited therein.]

The paper most relevant to the issue of schema evolution is the
following:

J. Banerjee, W. Kim, H.J. Kim, H.F. Korth.  "Semantics and
Implementation of Schema Evolution in Object-Oriented Databases."
Proceedings of SIGMOD87.
[clamen]

> Exodus (UWisc)

No solution for the problem of schema evolution is provided.

Emulation is rejected by the authors, who claim that the addition of
a layer between the EXODUS Storage Manager and the E program would
seriously reduce efficiency.

Automatic conversion, whether lazy or eager, is also rejected, as it
does not mesh well with the C++ data layout.  To implement immediate
references to other classes and structures, C++ embeds class and
structure instances within the object that refers to them.
Converting such an instance can therefore change the size of the
enclosing object, which might invalidate remote pointer references.

Joel E. Richardson and Michael J. Carey.  "Persistence in the E
Language: Issues and Implementation."  Appeared in "Software:
Practice and Experience", 19(12):1115-1150, December 1989.
[clamen]

> Machiavelli (UPenn)

Machiavelli is a statically-typed persistent programming language
project at the University of Pennsylvania.  It does not address type
evolution.
[communication with limsoon@saul.cis.upenn.edu]

> ConceptBase

We have developed a deductive object-oriented database called
ConceptBase where everything (tokens, classes, meta-classes,
meta-meta-classes, attributes, instantiations, specializations) is
treated as an object.  That means that you may update the "schema"
(classes) at any time just as any other ordinary object.
The system has (user-defined and builtin) integrity constraints that
prevent inconsistency (e.g., violation of referential integrity).
Integrity constraints in ConceptBase are (as in most other systems)
static, i.e., they are conditions that each database "state" must
satisfy.

The data model we use does not distinguish schema level information
(i.e., classes) from instance level information.  If you change, for
example, some classes and this change violates some integrity
constraints (e.g., some instances no longer have the right attribute
types), then you have the choice either to reject the update or to
change the existing DB.  Currently, ConceptBase simply rejects such
updates.  We are thinking of exploiting abduction (see the VLDB'90
article of Kakas & Mancarella) to react more cleverly, in the sense
of "reformatting" instances.
[Manfred Jeusfeld]

> AVANCE (SYSLAB)

An object-oriented, distributed database programming language.  Its
most interesting feature is the presence of system-level version
control, which is used to support schema evolution, system-level
versioning (as a way of improving concurrency), and objects with
their own notion of history.  The system consists of a programming
language (PAL) and a distributed persistent object manager.

REFS: Anders Bjornerstedt and Stefan Britts.  "AVANCE: An Object
Management System".  Proceedings of OOPSLA88.
[clamen]

> Altair/O_2 (INRIA)

Neither of the two articles I have (bibliographic information below)
addresses the issue of schema evolution or database conversion.

REFS: F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel,
S. Gamerman, C. Lecluse, P. Pfeffer, P. Richard, and F. Velez.  "The
Design and Implementation of O2, an Object-Oriented Database System".
Advances in Object-Oriented Database Systems, Springer Verlag.
(Lecture Notes in Computer Science series, Number 334.)

C. Lecluse, P. Richard, and F. Velez.  "O2, an Object-Oriented Data
Model".  Proceedings of SIGMOD88.
Also appears in Zdonik and Maier, "Readings in Object-Oriented
Database Systems", Morgan Kaufmann, 1990.
[clamen]

> OTGen (CMU)

OTGen describes a scheme for computer-assisted schema evolution.  A
wide variety of changes (wider than those supported by ORION or
GemStone) can be expressed in the evolution "mini-language", which
describes a procedure for transforming instances from their old to
their new representations.  Objects are converted as databases (which
in the envisioned OTGen system are rather small) are opened.

REFS: Barbara Staudt Lerner and A. Nico Habermann.  "Beyond Schema
Evolution to Database Reorganization".  Proceedings of
OOPSLA/ECOOP '90.
[clamen, blerner@cs.umass.edu]

<< Commercial Systems >>

> CLOS

Not persistent, but implementations must support redefinition of
classes and the conversion (either lazy or eager) of existing
instances.  [cf. CLtL II]  In spite of this freedom, implementations
seem to convert lazily.
[communication with gregor@parc.xerox.com, hornig@symbolics.com,
dussud@lucid.com]

> Statice (Symbolics)

I'm familiar with Statice, sold by Symbolics Inc.  The Statice
command "Update Database Schema" brings an existing database into
conformance with a modified schema.  Changes are classified as either
compatible (lossless, i.e., completely information-preserving) or
incompatible (i.e., potentially information-losing in the current
implementation).  Basically, any change is compatible except for the
following:

-- If an attribute's type changes, all such attributes extant are
   re-initialized (nulled out).  Note that Statice permits an
   attribute to be of type T, the universal type.  Such an attribute
   can then take on any value without schema modification or
   information loss.

-- If a type's inheritance (list of parents) changes, the type must
   be deleted and re-created, losing all extant instances of that
   type.  This is Statice's most serious current limitation.
   The simplest workaround is to employ a database dumper/loader
   (either the one supplied by Symbolics or a customized one) to save
   the information elements and then reload them into the modified
   schema.
[lgm@iexist.att.com]

> Versant

Versant provides schema evolution.  But in the current release, only
leaf classes in the schema can be modified.  Leaf classes can be
added, dropped, and renamed, and individual attributes and methods
can be changed.  The class instances are modified later, as they are
accessed.  There are no security mechanisms for preventing users from
changing the schema.  Schema changes are done using a separate
utility that compares files (with a .sch extension) containing the
new schema definitions with those of a database, and changes the
database schema so that there is no difference.  In case of
conflicting class names or other such situations, the user has
control over resolving the conflict.
[h.subramanian@trl.OZ.AU]

I've been looking at the C++ database vendors.  Versant has schema
evolution at the leaf class level only.  They're trying to come up
with a good way to do it for the general case.  They talk about using
versioning to mark class evolution.  Then they want to test
timestamps when an object is retrieved to see whether its class has
been changed.  If it has, they reformat the object to conform to the
new definition at that time.
[arc!chet@apple.com]

> Object Design

Object Design, to the best of my knowledge, do[es]n't support schema
evolution at this time.
[arc!chet@apple.com]

> Objectivity

Objectivity, to the best of my knowledge, do[es]n't support schema
evolution at this time.
[arc!chet@apple.com]

> ObjectStore

ObjectStore does not provide schema evolution as yet, but it has
promised to provide schema evolution in the next release.
[h.subramanian@trl.OZ.AU]

> Ontos [formerly VBase] (Ontologic)

Ontos provides schema evolution.  It allows any class to be modified.
The major drawback is that data does not migrate, i.e., instances are
not modified to adapt to the new class definition.  So schema changes
can be done only on classes that do not contain instances and do not
have subclasses that contain instances.
[h.subramanian@trl.OZ.AU]

As a system for experiments, we are currently using ONTOS from
Ontologic Inc.  Unfortunately, there is no transparent concept of
schema evolution for populated databases.  Thus, we are still
investigating how it works.
[Markus Tresch]

> GemStone (Servio-Logic)

The authors reject the emulation scheme and the lazy conversion
approach as previously outlined.  Instead, they favor a mixed
strategy, which involves lazy conversion until the next garbage
collection, at which point all remaining old instances are upgraded.
(Their current implementation, however, does not yet support this
feature; conversion is done eagerly for the time being.)

They identify a list of constraints which must be preserved across
modification to type descriptions and to the inheritance hierarchy.
The authors then proceed to enumerate a number of categories of
object updates that are permitted, and what changes to the dependent
instances and subclasses must be performed in order to maintain the
integrity of the database (i.e., to preserve the above constraints).

REFS: Robert Bretl, David Maier, Allan Otis, Jason Penney, Bruce
Schuchardt, Jacob Stein, E. Harold Williams, Monty Williams.  "The
GemStone Data Management System."  Chapter 12 of "Object-Oriented
Concepts, Databases and Applications", by Kim and Lochovsky.
[clamen]

> Base/OPEN (NMP-CAD)

A structurally object-oriented system (i.e., methods are not stored);
only schema extension is supported.  Instances of older type-versions
are never converted, but can coexist in the database with newer
objects.
[communication with tomas@basf.nmpcad.se]

<<< OTHER MODELS >>>

<< Commercial Systems >>

> Pick

With Pick and its variants you only have problems if you want to
redefine an existing field.  Because of the way the data are stored
and the separation of the data and the dictionary, you can define
additional fields in the dictionary without having to do anything to
the data, a facility which we have found very useful in a number of
systems.

There is no general facility to redefine an existing field: you just
make whatever changes are required in the dictionary, then write an
Info Basic program to change the data.  We have seldom needed to do
this, but it has not been complicated to do.
[Geoff Miller]

> IDL (Persistent Data Systems)

IDL is a schema definition language.  Schema modifications are
defined in IDL and, in general, require ad-hoc offline
transformations of the database.  A simple class of transformations
(i.e., integer format changes, list->array, attribute addition) can
be handled by the IDL->ASCII and ASCII->IDL translators.
[conversation with Ellen Borison of Persistent Data Systems]

<< Research Systems >>

> IRIS (HP Labs)

Objects in the Iris system may acquire or lose types dynamically.
Thus, if an object no longer matches a changed definition, the user
can choose to remove the type from the object instead of modifying
the object to match the type.  In general, Iris tends to restrict
class modifications so that object modifications are not necessary.
For example, a class cannot be removed unless it has no instances,
and new supertype-subtype relationships cannot be established.

REFS: D.H. Fishman, D. Beech, H.P. Cate, E.C. Chow, T. Connors,
J.W. Davis, N. Derrett, C.G. Hock, W. Kent, P. Lyngbaek, B. Mahbod,
M.A. Neimat, T.A. Tyan, M.C. Shan.  "Iris: An Object-Oriented
Database Management System".  ACM Transactions on Office Information
Systems 5(1):48-69, Jan 1987.
[clamen]

--
Stewart M. Clamen               Internet: clamen@cs.cmu.edu
School of Computer Science      UUCP:     uunet!"clamen@cs.cmu.edu"
Carnegie Mellon University      Phone:    +1 412 268 3620
Pittsburgh, PA 15213-3890, USA  Fax:      +1 412 268 1793