Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!yale!cmcl2!uupsi!bse.com!eberard
From: eberard@bse.com (Edward V. Berard)
Newsgroups: comp.object
Subject: Re: Object-Oriented COBOL?
Summary: unfortunately, it is not a shot from the hip
Keywords: relational versus object-oriented DBMS
Message-ID: <0B010001.exdbt4@bse.com>
Date: 9 Nov 90 12:16:02 GMT
Reply-To: eberard@bse.com
Organization: Berard Software Engineering, Inc.
Lines: 98
X-Mailer: uAccess - Mac Release: 0.2.7


In article <13921@neptune.inf.ethz.ch>, marti@mint.inf.ethz.ch (Robert Marti) writes:
> 
> >Good luck, and welcome to the object-oriented vs. relational DBMS wars.
> >While a relational DBMS can be used with an object-oriented software
> >development effort, you will have to:
> >	a. write some additional software to interface with the relational
> >	   DBMS,
> >	b. very likely corrupt your design to accommodate the
> >	   relational DBMS
> 
> This sounds like a shot from the hip.
> 
> Do you have any evidence to substantiate your second claim?

Unfortunately, I have a great deal of evidence to support my claim, all of
it from real projects. Consider the following scenario:

A project requires persistent objects. A decision is made to use a relational
DBMS. "Persistent storage of an object" is interpreted to mean "storage of
the instance variables for the object." Several assumptions are made, e.g.,
that the entirety of the object's state is represented by the sum of its
instance variables, and that all instance variables will be in a form which
is easily stored in a conventional relational DBMS.

One of the system designers identifies a rectangle object. Internally, the
rectangle object stores its height and width as two separate instance variables.
It is these height and width instance variables which are stored in the relational
DBMS. Assuming that there is a simple scheme in place to associate a height-width
pair with a specific rectangle, all would appear to be well.

However, the designer of the rectangle object has included an "area" operation
in the interface for the rectangle, i.e., this operation returns the area
for the rectangle. The designer initially chose not to create an "area instance
variable," because the method which calculated the area would simply access
the height and width values and perform the calculation.

Searching the relational DBMS for all rectangles whose height and/or width
met certain criteria is easy. Searching the relational DBMS for all rectangles
whose area meets some criteria is a different matter. Specifically, the searcher
must know the method for calculating the area, and what specific information
(i.e., height and width) must be extracted from the relational DBMS to perform
this calculation.

As you may have noticed, we are violating one of the fundamental principles
of object-oriented software engineering, i.e., information hiding. Those who
query the database must know a good deal about the underlying implementation
of the object. In the case of the rectangle's area, they must know that height
and width exist, and the algorithm by which these values may be manipulated
to determine the area.
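
A minimal sketch of the leak, in Python with the standard sqlite3 module.
The table layout, the sample data, and every name here are hypothetical and
are not taken from any of the projects mentioned above:

```python
import sqlite3


class Rectangle:
    """Rectangle that hides its state and computes its area on demand."""

    def __init__(self, height, width):
        self._height = height
        self._width = width

    def area(self):
        return self._height * self._width


# Hypothetical schema: only the instance variables are persisted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rectangle (id INTEGER PRIMARY KEY, height REAL, width REAL)"
)
conn.executemany(
    "INSERT INTO rectangle (height, width) VALUES (?, ?)",
    [(2, 3), (4, 5), (1, 10)],
)

# The query author must know that height and width exist and must restate
# the area algorithm in SQL, outside the Rectangle class.
rows = conn.execute(
    "SELECT id FROM rectangle WHERE height * width >= 10 ORDER BY id"
).fetchall()
large_ids = [row[0] for row in rows]
```

The expression height * width now lives in two places: the area method and
the text of the query.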

Imagine what would happen with more complex objects with more complex operations.
Further, imagine what will happen if someone chooses to change the underlying
implementation (i.e., instance variables and methods) for the persistent objects.
The knowledge of the original implementation is dispersed throughout the application.

Under ideal circumstances, someone should be able to formulate queries on
persistent objects using only the information contained in the objects' interfaces.
There should be no need to concern oneself with the underlying implementation
of any given object.
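
As a rough illustration of that ideal, the Python sketch below queries a
collection of objects through the "area" operation alone. The select
function stands in for a hypothetical object store, and CornerRectangle is
an invented second implementation; neither comes from a real product:

```python
class Rectangle:
    """Stores height and width."""

    def __init__(self, height, width):
        self._height = height
        self._width = width

    def area(self):
        return self._height * self._width


class CornerRectangle:
    """Same interface, different hidden state: two opposite corners."""

    def __init__(self, x0, y0, x1, y1):
        self._x0, self._y0, self._x1, self._y1 = x0, y0, x1, y1

    def area(self):
        return abs(self._x1 - self._x0) * abs(self._y1 - self._y0)


def select(objects, predicate):
    """Hypothetical store query: keep the objects the predicate accepts."""
    return [obj for obj in objects if predicate(obj)]


store = [Rectangle(2, 3), CornerRectangle(0, 0, 4, 5), Rectangle(1, 10)]

# The query mentions only the interface, never height, width, or corners.
large = select(store, lambda r: r.area() >= 10)
large_areas = [r.area() for r in large]
```

The same query works for both implementations, so changing the instance
variables of a rectangle would not require touching it.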

Now, back to my point on the corruption of the design of the system. In an
ideal object-oriented approach, the designer of an object knows that all interactions
with an object take place through the object's interface. Further, the details
of the underlying implementation of the object are hidden from the outside
world. Therefore, the designer of the object should be free to choose the
appropriate methods, and to decide on what instance variables will be created/used
in these methods.

Given that the objects will be "stored" in a relational DBMS, the designer
of an object must think about the types of queries which might be placed against
the object. In the case of the area of a rectangle, the designer may now have
to consider creating an "area" instance variable (as opposed to making the
area algorithm public knowledge). The designer is no longer as free to make
decisions on the underlying implementation.
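
The compromise might look like the following Python sketch, again using
the standard sqlite3 module. The redundant _area variable, the resize and
as_row methods, and the table layout are all hypothetical:

```python
import sqlite3


class Rectangle:
    """Rectangle reshaped to suit the database, not the design."""

    def __init__(self, height, width):
        self._height = height
        self._width = width
        # Redundant state, kept only so that it can be stored and queried.
        self._area = height * width

    def resize(self, height, width):
        self._height = height
        self._width = width
        # Every mutator must now remember to keep the copy in step.
        self._area = height * width

    def area(self):
        return self._area

    def as_row(self):
        return (self._height, self._width, self._area)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rectangle "
    "(id INTEGER PRIMARY KEY, height REAL, width REAL, area REAL)"
)
rects = [Rectangle(2, 3), Rectangle(4, 5), Rectangle(1, 10)]
conn.executemany(
    "INSERT INTO rectangle (height, width, area) VALUES (?, ?, ?)",
    [r.as_row() for r in rects],
)

# The query no longer restates the algorithm, but the object paid for it.
rows = conn.execute(
    "SELECT id FROM rectangle WHERE area >= 10 ORDER BY id"
).fetchall()
large_ids = [row[0] for row in rows]
```

The query is simpler, but the choice of instance variables was dictated by
the storage mechanism rather than by the design of the object.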

Also consider more complex objects, e.g., objects which are aggregations of
other objects. Under normal circumstances, a composite object may choose to
query the states of its component objects to determine some of its own state
information. Specifically, there may be no "high level" instance variables
which represent a specific state. Rather, the larger object determines its
state on demand through queries to its component objects.
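
A small Python sketch of such a composite, with a hypothetical Drawing
object made up of rectangles (the names are illustrative only):

```python
class Rectangle:
    def __init__(self, height, width):
        self._height = height
        self._width = width

    def area(self):
        return self._height * self._width


class Drawing:
    """Composite object: holds components, derives its own state from them."""

    def __init__(self, shapes):
        self._shapes = list(shapes)

    def total_area(self):
        # No "total area" instance variable exists; the value is obtained
        # on demand by querying each component through its interface.
        return sum(shape.area() for shape in self._shapes)


drawing = Drawing([Rectangle(2, 3), Rectangle(4, 5)])

# The only instance variable is the list of components, so storing "the
# instance variables" records nothing about the total area directly.
stored_fields = sorted(vars(drawing))
total = drawing.total_area()
```

A scheme that persists only instance variables has no column to search for
the total area, short of adding one to the composite for that purpose.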

I have seen quite a number of attempts to approximate storage of objects in
a relational DBMS. Unfortunately, all the attempts that I have seen to date
require that either the implementation of individual objects be compromised
(either the underlying implementation, or requirements placed on the interfaces
of the objects), or that only certain information may be made persistent.

				-- Ed


----------------------------------------------------------------------------
Edward V. Berard                                | Phone: (301) 353-9652
Berard Software Engineering, Inc.               | FAX:   (301) 353-9272
18620 Mateney Road                              | E-Mail: eberard@bse.com
Germantown, Maryland 20874                      | 
----------------------------------------------------------------------------