Path: utzoo!attcan!uunet!odi!dlw
From: dlw@odi.com (Dan Weinreb)
Newsgroups: comp.databases
Subject: Re: Comment on the "Third-Generation Database System Manifesto"
Message-ID: <1990Sep24.071412.3561@odi.com>
Date: 24 Sep 90 07:14:12 GMT
References: <21178@hercules.csl.sri.com>
Reply-To: dlw@odi.com
Distribution: comp
Organization: Object Design, Inc.
Lines: 126
In-Reply-To: cimshop!davidm@uunet.UU.NET's message of 24 Sep 90 03:43:50 GMT

In article cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:

> Examples please. There's too much "feeling" in this paragraph. By the
> way, one criticism I have of the OO paper is the lack of credible
> reasoning on why a system implementing the relational model (the
> current relational systems are not really *completely* relational) is
> not a basic OODB that can be built upon to full object-oriented
> status.

It seems to me that the OO paper is simply an attempt to define the
term OODB. It does not comment, one way or another, about whether it
is possible to build an OODB on top of some other kind of database, or
to extend a non-OODB into an OODB by adding things. I'm sure the
authors all have opinions about that, but the "manifesto" only
addresses the question of what an OODB is.

> Actually, the immutable, system-assigned primary key was a central
> part of the Codd/Date RM/T model (Date's Introduction to Database
> Systems V2), so the UID concept doesn't disagree with the relational
> model.

Well, the RM/T model is not the same thing as the Relational model.
The people are the same (so to speak) but RM/T is a distinct model,
with new elements (such as the system-assigned primary key) designed
to remedy some problems of the relational model. The OODB's also
provide the equivalent of a system-assigned primary key.

> Proposition 2.1 of the manifesto says that "Essentially, all
> programmatic access to a database should be through a
> non-procedural, high-level access language".
> It argues that the navigational approach used in OODBs is undesirable
> and inefficient compared with the use of non-procedural query
> languages in relational systems.

In my own opinion, this is one of those issues that is clouded by the
code words of adherents of dogma. The extended-relational people
criticize the OODB people because an OODB can say things like "get me
the father of X", which they denounce because it is imperative rather
than declarative. In a relational database, you'd say something like
"for all people in the database, find all people Z such that Z's
father-id attribute has the value X, and print out (?) the value of
the primary-key-id field for all of them" in order to do the same
thing. The latter (my English translation of SQL) is considered
ideologically pure because it is considered to be declarative rather
than imperative. It is also argued that it will run faster because of
the omniscient wisdom of a general-purpose query optimizer.

As you can guess from my incendiary tone (forgive me), I remain
unconvinced that this is true. On the other hand, I certainly do feel
that for some database operations, it is highly desirable to use a
high-level, "non-procedural" (in the sense in which they mean it)
query construct, that it is appropriate to use a query optimizer, and
that the optimizer can do the best job. It depends on the operation.
But applying the "declarative is always virtuous, procedural is always
evil" principle, particularly without clear definitions of those
terms, leads to neither enlightenment nor efficient execution.

Some people seem to think that anybody who is an advocate of OODBs is
utterly opposed to nonprocedural query languages. I don't know why.
My feeling is that the application writer should be provided with a
toolkit that provides a range of useful tools, so that he or she can
use the appropriate tool for the particular job at hand. Any serious
OODB should have a query language and a query optimizer.
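
To make the two styles concrete, here is a minimal sketch in Python of
"get me the father of X" done both ways. Everything in it is an
assumption made up for illustration: the Person class, the person
table with its id, name, and father_id columns, and the function
names. It is not the interface of any particular OODB or relational
product, and the SQL shown is only one way to phrase the query.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical in-memory object; `father` is a direct reference."""
    name: str
    father: Optional["Person"] = None


def father_navigational(x: Person) -> Optional[str]:
    # Navigational style: follow the reference stored on X itself.
    return x.father.name if x.father else None


def build_db() -> sqlite3.Connection:
    # Hypothetical relational schema: one row per person, with a
    # father_id column holding the father's primary key (or NULL).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, "
        "father_id INTEGER REFERENCES person(id))"
    )
    conn.executemany(
        "INSERT INTO person (id, name, father_id) VALUES (?, ?, ?)",
        [(1, "Adam", None), (2, "Ben", 1), (3, "Carl", 2)],
    )
    return conn


def father_declarative(conn: sqlite3.Connection, name: str) -> Optional[str]:
    # Declarative style: state which row is wanted and leave the
    # access path to the engine's query planner.
    row = conn.execute(
        "SELECT f.name FROM person AS c "
        "JOIN person AS f ON f.id = c.father_id "
        "WHERE c.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None
```

Both functions answer the same question. Which one is the right tool
depends on the operation, which is the point of the toolkit argument.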

> A relational system can take advantage of the same clustering idea.
> "Time consuming optimizations", IMHO, are in the eye of the beholder.
> An OODB wouldn't have the full understanding of relationships that
> can occur in a database and, so, couldn't take advantage of a
> recursive join operation (unless the OODB was a relational DB).

And why wouldn't an OODB have a full understanding of relationships?
Your paragraph above must be read in the light of some understanding
of "what is an OODB" and "what is a relational DB". You say that it
can't have understanding unless it is a relational DB. Do you
seriously mean that it has to follow Codd's Rules or else such an
optimization is impossible? I suspect you mean something less
stringent. Is what you mean incompatible with the definition of OODB,
as given in the OODB manifesto?

> Do you think that relational systems could not improve themselves
> with the above solutions?

This statement is another one that always crops up in these
discussions. For about 2/3 of the advantages that the OODB people
claim, the relational ("post-relational"?) people answer by saying
"well, we could do that too". What this really points out is two
things. First, many of the differences between the two camps don't
actually hinge on the question of data model (OO vs. relational), but
really have to do with totally unrelated questions of implementation
technique, programming language interfaces, and so on. Second, for
application programmers who are trying to obtain a DBMS in some
particular year to accomplish some particular job, the key question is
not what is theoretically possible but what can be acquired promptly.

> Proposition 2.4 of the manifesto says that "Performance indicators
> have almost nothing to do with data models and must not appear in
> them." I disagree with this claim.
> Although performance is heavily influenced by individual
> implementation techniques, there exist inherent limitations on the
> performance achievable for the underlying data models.

As I said above, I don't agree with this. The data model merely
specifies the external interface to the DBMS. There are many, many
clever tricks that can be used to implement such an interface to make
it run more quickly. The only things that are inherent in the data
model are whether it is possible to say certain things, and how clear
and convenient it is to say them. It would be impossible to somehow
prove that nobody could come up with a clever implementation, one that
might work in a totally non-obvious way, to make some particular
database operation fast. When you look at performance, you really
have to examine the speeds of particular DBMSs on particular tasks.

> As for OODBs, although the lack of declarative query languages and a
> formal object algebra/calculus make it less intuitive for end users
> to use currently,

Oh, come now. Several existing OODBs have declarative query
languages. As for the great benefits of formal algebra/calculus,
please read Codd's essay criticizing SQL. SQL is based neither on the
relational algebra nor the relational calculus but on a sort of hybrid
of both; it's inconsistent and counterintuitive in many ways. SQL is
not prevailing in the relational DBMS industry right now because of
its clarity or its theoretical purity, but because of political
factors involving IBM endorsement and the need for standardization of
interfaces. OODB query languages are not less intuitive than SQL.

These are purely my own late-night musings and are not necessarily the
official position of my employer.

Dan Weinreb
dlw@odi.com
Object Design, Inc.