Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!cs.utexas.edu!uunet!odi!jack
From: jack@odi.com (Jack Orenstein)
Newsgroups: comp.databases
Subject: OO DBMSs (was Re: Extended RDB vs OODB)
Message-ID: <1989Aug21.132525.3179@odi.com>
Date: 21 Aug 89 13:25:25 GMT
References: <3560052@wdl1.UUCP> <411@odi.ODI.COM> <19@dgis.daitc.mil> <1989Aug14.140128.15094@odi.com> <27@dgis.daitc.mil>
Reply-To: jack@odi.com (Jack Orenstein)
Organization: Object Design Inc., Burlington, MA
Lines: 105


Here are replies to some recent questions that have come up in the OO
DBMS discussions. The answers are, for the most part, specific to the
OO DBMS being built at Object Design, but will often, I believe, apply
to competitors' products as well.

David Masterson writes:
   
   Based on Jack Orenstein's message, I have a couple of questions:
   
   1. In implementing an OODB on top of C++ using the notion of persistent and
   transient type objects, when you refer to information in the OODB, is it
   always by an object identifier?  How, therefore, would you find objects
   meeting some qualification if you don't know its identifier?  Is this even a
   type of query you would ask in an OODB world?  (you ALWAYS know the identifier
   because even a qualification would be wrapped in an object which contains the
   identifier?)

It will often be the case that object ids are known because they are
stored in persistent variables. For example, a persistent variable of
type part* stores the id of a part.

In other cases, an id will not be known, but properties of the object
can be described as part of a query. Queries are expressed using
existing C++ syntax for control (i.e. boolean) expressions. For
example, given a set of parts (which may contain both transient and
persistent instances), queries can be written to ask for all parts
whose weight exceeds a given amount, all parts containing a given
sub-part, all parts contained in a given part, etc. Compound queries
can be expressed also, e.g. find all parts containing a frammis-joint
linkage that were manufactured by Acme.
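
As a rough illustration of what such a query can look like when the
predicate is an ordinary C++ boolean expression, here is a minimal
sketch in present-day C++. It is not the Object Design query facility:
the Part type, its fields, and the helpers select_where,
contains_named, acme_frammis_parts and demo_query_count are all
hypothetical names invented for this example, and the "set of parts"
is just an in-memory vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical part type; the field names are illustrative assumptions.
struct Part {
    std::string name;
    double weight;
    std::string manufacturer;
    std::vector<const Part*> subparts;  // directly contained parts
};

// True if `whole` contains a part named `name`, directly or transitively.
bool contains_named(const Part& whole, const std::string& name) {
    for (const Part* sub : whole.subparts) {
        if (sub->name == name || contains_named(*sub, name)) return true;
    }
    return false;
}

// Generic selection: keep the parts for which the boolean predicate holds.
template <typename Pred>
std::vector<const Part*> select_where(const std::vector<const Part*>& parts,
                                      Pred pred) {
    std::vector<const Part*> result;
    for (const Part* p : parts) {
        if (pred(*p)) result.push_back(p);
    }
    return result;
}

// Compound query: parts made by Acme that contain a frammis-joint linkage.
std::vector<const Part*> acme_frammis_parts(
        const std::vector<const Part*>& parts) {
    return select_where(parts, [](const Part& p) {
        return p.manufacturer == "Acme" &&
               contains_named(p, "frammis-joint linkage");
    });
}

// Builds a small made-up data set and runs both kinds of query on it.
std::size_t demo_query_count() {
    Part linkage{"frammis-joint linkage", 0.5, "Widgetco", {}};
    Part arm{"arm", 3.0, "Acme", {&linkage}};
    Part body{"body", 40.0, "Acme", {&arm}};
    Part wheel{"wheel", 8.0, "Acme", {}};
    Part crate{"crate", 60.0, "Widgetco", {&linkage}};
    std::vector<const Part*> parts{&linkage, &arm, &body, &wheel, &crate};

    // Simple query: all parts whose weight exceeds a given amount.
    std::vector<const Part*> heavy =
        select_where(parts, [](const Part& p) { return p.weight > 10.0; });
    assert(heavy.size() == 2);  // body and crate

    return acme_frammis_parts(parts).size();
}
```

In this sketch the same predicate would apply whether the parts were
transient or persistent, since the query only looks at ordinary
members of the objects it is handed.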

   
   2.  Again using the architecture of persistent and transient objects,  is a
   persistent object ever in memory?  Or is it just a transient copy of a
   persistent object that is in memory?  Then, how are persistent objects
   created?

Yes, the persistent objects themselves are manipulated by
applications. Copying isn't good enough since a copy of an object has
a different identity. (This might not be true in other languages, but
the idea of equating an object's id with its address is fundamental to
C++. Of course, it is possible to define a base "object" class, define
it to have an "id" data member, redefine initialization, =, and == to
work off this id, and then use "object" to derive all other classes,
but the space and time overhead will be significant). One example of
the difficulties that arise with copying is that existing pointers to
an object continue to point to the original, not to any copy of it.

Copies of objects can be made, as is usually the case in C++, and the
semantics of C++ are preserved. I.e., the copy is a distinct object.
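
The point about identity can be checked in plain C++ without any
database at all. The following is a minimal sketch, assuming a
hypothetical SimplePart type and helper names (same_object,
copy_is_distinct_object) made up for this illustration. It shows only
the standard language behaviour the text relies on: identity is the
address, and a copy is a different object.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct SimplePart {
    double weight;
};

// In C++ an object's identity is its address: two pointers denote the
// same object exactly when they compare equal.
bool same_object(const SimplePart* a, const SimplePart* b) {
    return a == b;
}

// Returns true if a copy behaves as a distinct object.
bool copy_is_distinct_object() {
    SimplePart original{10.0};
    SimplePart* ref = &original;   // a pointer holding the object's identity
    SimplePart copy = original;    // same state, different identity

    assert(same_object(ref, &original));
    assert(!same_object(ref, &copy));

    copy.weight = 99.0;            // updating the copy...
    // ...is not visible through the pointer to the original.
    return !same_object(ref, &copy) && ref->weight == 10.0;
}
```

This is why handing an application a copy of a stored object would not
be enough: every pointer already held elsewhere would still refer to
the original.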
   
   
From D. C. Martin:

   Dan Weinreb of ODI writes:
   
       There should not be any special declaration for
       "pointers to persistent" or "pointers to possibly persistent" data as
       distinct from ordinary pointers.
       
   It would be nice if no one ever had to consider if a pointer was persistent
   or non-persistent, but someone will have to build the access methods and
   other low-level interface routines to your storage manager in order to
   provide this type of "pointer swizzling" to the application developer.
   At UW - Madison the Exodus Project is developing a language called E, which
   is a persistent C++ language designed to allow an individual to write
   her own access methods, and to a certain extent pointers to resident objects
   are equivalent to persistent.  However, for this equivalency the pointer
   types must be DB pointers, i.e. dbchar* != char*, but a persistent dbchar*
   is equivalent to a non-persistent dbchar*.

We are very familiar with the Exodus project, and with the E language.
While the type system of E is far preferable to that of a typical
host-language/DBMS combination, it still has two distinct, but
"parallel", type systems, and programmers have to be careful about the
use of db types. In our product, there will be a single type system,
that of C++. There is no fundamental reason why persistent and
transient types have to be distinguished in the language used by the
application programmer.

Unfortunately, the details of how "swizzling" works are proprietary,
so I can't discuss the issue.

   
       In particular, de-referencing a
       pointer has exactly the same semantics and syntax regardless of
       whether the objects are persistent or transient.  In general, data
       manipulation (storing, fetching, testing, adding, printing, field
       extraction, function calling, casting) looks exactly as it does for
       normal C++.
   
   What about dereferencing a pointer to a 40mb image?  Does this mean
   bringing the entire image into core?  There must be some low-level routines
   to allow the application programmer to inform the language that certain
   special methods should be used to store, fetch, etc... for special
   datatypes.

I'll have to take the 5th again, but I will say that there is no need
to bring in the entire 40mb image just to retrieve one byte of it.
   
   
Jack Orenstein
Object Design, Inc.