Path: utzoo!utgpu!watmath!att!ucbvax!agate!usenet
From: hughes@math.berkeley.edu (Eric Hughes)
Newsgroups: comp.databases
Subject: Re: Extended RDB vs OODB
Message-ID: <1989Aug17.180057.2623@agate.berkeley.edu>
Date: 17 Aug 89 18:00:57 GMT
References: <28@dgis.daitc.mil>
Sender: usenet@agate.berkeley.edu (USENET Administrator;;;;ZU44)
Reply-To: hughes@math.berkeley.edu (Eric Hughes)
Organization: UC Berkeley Math Dept
Lines: 50
In-reply-to: jkrueger@dgis.daitc.mil (Jonathan Krueger)

In article <28@dgis.daitc.mil>, jkrueger@dgis (Jonathan Krueger) writes:
>Therefore I'd like to divide the question: efficient implementation of
>a data model versus inherently bad performance of some models for some
>operations.  Recent traffic has confused the two issues without
>addressing either.

"Inherently bad performance" is a slippery term.  It is important to
remember that a database model is an abstraction, and that there are
many different implementations of the same abstraction.  To speak of
"inherently bad performance" would require finding some predicate that
is invariant over all possible implementations and showing that this
invariant has certain undesirable minimum time and/or space growth
rates.  I can see both information theory and complexity theory being
useful here, but as far as I know no work has been done in this area.
For example, one could calculate some measure of the information
present in a query and relate it to some critical path of operations
to get a lower bound.  In short, I don't think it makes much sense to
talk of "inherently" bad performance, at least for now.

The situation is not so hopeless, because this kind of operation
counting is possible once one fully specifies the type of
implementation.  For example, one could analyze the performance of an
RDBMS implemented as nothing but flat files with no indexes; that
yields time and space estimates in terms of the file sizes.  By adding
indexes to the implementation one can reduce some O(n) operations to
O(log n), and by adding pointer rings to represent a one-to-many
relationship (redundant data for speed's sake) one can reduce some
O(log n) operations to O(log log n), which for most applications is as
good as constant.  This is not inherent performance, but it might be
termed "practically inherent."

Let's be careful when we talk about the performance of a model.  Only
programs have a performance measure per se, so to measure the
performance of a model one must somehow relate it to the performance
of some programs.

Finally, I have a two-part conjecture:

1. Asymptotic performance measurements (i.e. "order of" measurements)
   of the relational and object-oriented models (and all their
   children :-) are identical.

2. No model has a set of constants for such asymptotic measurements
   (i.e. the constants that get rid of the "order of" symbol) whose
   inverses dominate (i.e. are everywhere better than) the respective
   constants of all other models.

To paraphrase: both models have inherently the same order-of-magnitude
performance, and certain models are better suited to certain
operations than others.

Eric Hughes
hughes@math.berkeley.edu
ucbvax!math!hughes
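
[A minimal sketch, not part of the original post, illustrating the scan-versus-index contrast described above: a linear scan over an unindexed "flat file" of rows is O(n), while a lookup through a sorted index (binary search, analogous to a B-tree) is O(log n).  The table layout and field names (emp_id, name) are hypothetical.]

# Python sketch: flat-file scan vs. indexed lookup
import bisect
import random

# Unindexed flat file: a list of (emp_id, name) rows in arbitrary order.
rows = [(i, "employee-%d" % i) for i in range(100000)]
random.shuffle(rows)

def scan_lookup(rows, key):
    """O(n): examine every row until the key is found."""
    for emp_id, name in rows:
        if emp_id == key:
            return name
    return None

# Index: a sorted list of (key, position-in-file) pairs over the same
# flat file, playing the role of a B-tree index.
index = sorted((emp_id, pos) for pos, (emp_id, _) in enumerate(rows))
keys = [k for k, _ in index]

def indexed_lookup(rows, index, keys, key):
    """O(log n): binary-search the index, then fetch one row."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return rows[index[i][1]][1]
    return None

print(scan_lookup(rows, 54321))      # same answer, n comparisons
print(indexed_lookup(rows, index, keys, 54321))  # same answer, ~log n comparisons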