Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ncar!ames!pasteur!postgres!larry
From: larry@postgres.uucp (Larry Rowe)
Newsgroups: comp.databases
Subject: Re: Recent discussions... (*long* controversial message)
Message-ID: <16912@pasteur.Berkeley.EDU>
Date: 7 Sep 89 01:11:52 GMT
References: <16753@pasteur.Berkeley.EDU> <1989Sep5.214702.1377@odi.com>
Sender: news@pasteur.Berkeley.EDU
Reply-To: larry@postgres.UUCP (Larry Rowe)
Organization: Postgres Research Group, UC Berkeley
Lines: 85

here are some comments and further questions...

In article <1989Sep5.214702.1377@odi.com> dlw@odi.com writes:

>  Object-oriented databases, at least for
>the foreseeable future, are mainly to fill new needs.

how big a market can this be?  if you are selling a development tool
to the CAx companies, they will have to pay you a runtime license
fee.  since the packages that use your runtime system will run on
workstations and be priced under $2.5K per system, your OEM revenue
will probably be under $500 per copy (more likely in the $100-$200 range).
even at $500 a copy, you'll have to sell 100K copies to make $50M.  now,
how many total machines have apollo and sun sold in their lifetime?
maybe 500K?  assuming that 50% are running a product with an OODBMS
embedded in it, we're talking about a $125M market.  that's a good
place to start.  but, i sure would want to be confident that i was
going to dominate that market (i.e., own 30-60% of the market) or
i might not survive.  the problem that the "O" companies have is that
there are too many of them and probably only 1 that will survive.
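
the arithmetic above, as a rough python sketch.  the function names are
made up for illustration; the inputs are just the guesses in this post
(500K machines, 50% penetration, $100-$500 OEM fee), not measured data.

```python
# back-of-envelope market sizing; all inputs are the post's own guesses

def copies_needed(revenue_target, fee_per_copy):
    """copies you must ship to reach a revenue target at a given OEM fee."""
    return revenue_target // fee_per_copy

def market_size(installed_base, penetration, fee_per_copy):
    """total OEM revenue if `penetration` of the installed base embeds an OODBMS."""
    return installed_base * penetration * fee_per_copy

if __name__ == "__main__":
    # copies needed to make $50M at the optimistic $500 fee
    print(copies_needed(50_000_000, 500))
    # total market at each fee level, assuming 500K machines and 50% penetration
    for fee in (100, 200, 500):
        print(fee, market_size(500_000, 0.50, fee))
```

note that the $125M figure only holds at the $500 ceiling.  at the more
likely $100-$200 fee the same assumptions give a $25M-$50M market.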

so, the "O" companies are either going to have to get into markets with
2-5M machines (e.g., Macs and PCs) or they are going to have to broaden
their products.  you may not want to go into the MIS/end-user market, but
i'm not sure how you can grow to be a $100M/year company if you don't.

>Regarding normalization in relational systems:
>
>							the trick is to 
>   precompute a main memory representation of the complex object and store
>   that in the dbms along with the normalized version.
>
>To be fair, though, there are some extra costs incurred by this trick.
>You have to make sure that this precomputed representation is
>recomputed (or cache-invalidated) whenever there's a change in any
>value it depends on.  So someone has to check when those values are
>changed; ideally there should be a trigger/integrity-like check, to
>prevent slip-ups due to manual error, but even then the checks must
>have some runtime cost.  There also must be some storage overhead cost
>for storing two different representations of the same data.  It's
>certainly a good trick and I'm sure it provides fine performance for
>some applications, but in a speed-critical area with many updates, or
>when the number of instances is very large, these costs would have to
>be considered as part of a tradeoff.

all valid points.  look at postgres to see how mike stonebraker and i
designed a database system to handle this problem.  yes, the trick only
works for things that can be replicated.  people who want to store 747
and submarine designs can't replicate their databases.
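
to make the trick and its costs concrete, here is a toy python sketch.
it is not how postgres does it; the class and method names are
hypothetical, a dict stands in for the normalized tables, and pickle
stands in for the main memory representation.  every update pays for an
invalidation check, and the cached copy is the extra storage dlw mentions.

```python
import pickle

class ComplexObjectStore:
    """toy stand-in for a dbms: normalized parts plus a cached object image."""

    def __init__(self):
        self.parts = {}    # normalized form: part_id -> (object_id, value)
        self.cache = {}    # object_id -> pickled precomputed representation
        self.rebuilds = 0  # how many times the "join" had to be rerun

    def put_part(self, part_id, object_id, value):
        # trigger-like check: drop the cached image of every object the
        # change touches (the old owner, if the part moved, and the new one)
        old = self.parts.get(part_id)
        if old is not None:
            self.cache.pop(old[0], None)
        self.cache.pop(object_id, None)
        self.parts[part_id] = (object_id, value)

    def get_object(self, object_id):
        blob = self.cache.get(object_id)
        if blob is None:
            # cache miss: assemble the object from the normalized parts
            obj = {pid: val for pid, (oid, val) in self.parts.items()
                   if oid == object_id}
            blob = pickle.dumps(obj)
            self.cache[object_id] = blob
            self.rebuilds += 1
        return pickle.loads(blob)

if __name__ == "__main__":
    store = ComplexObjectStore()
    store.put_part("p1", "wing", 1)
    store.put_part("p2", "wing", 2)
    print(store.get_object("wing"), store.rebuilds)  # first read builds the image
    print(store.get_object("wing"), store.rebuilds)  # second read hits the cache
    store.put_part("p1", "wing", 10)                 # update invalidates the image
    print(store.get_object("wing"), store.rebuilds)
```

reads are cheap as long as updates are rare.  with many updates the
object gets reassembled over and over, which is the tradeoff raised above.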

so, let's discuss storing precomputed joins which is essentially what the
OODBMS's propose to do when storing complex objects.  surprisingly enough,
that idea has been around in the relational world for a while and it was
implemented by oracle 6-8 years ago.  it doesn't seem to make that much
difference in performance because rtingres hasn't been blown out of the
water by oracle in benchmarks.  in spite of the oracle chest pounding about
their TP1 numbers, the information i've seen suggests that rtingres has been
faster than them for most of the past decade.  of course, dbms performance
doesn't always lead to a sale.

the OODBMS proponents' response will be that TP1 and other MIS applications
don't have the kind of complex objects found in engineering applications
so the precomputed join strategy might not be cost effective.  my experience
is quite different.  MIS applications often have complex objects (e.g.,
application objects composed of 5-20 different object types with numerous
instances in one complex object) with shared subobjects.  these applications
would definitely take advantage of this mechanism.

don't misunderstand my point.  i am not saying that precomputed joins aren't
a viable strategy.  rather i'm saying they may not be as important as the
OODBMS folks believe they are.

but, my bigger point is that if this implementation strategy does become
a significant performance issue, the RDBMS folks will just implement it.

for my money, the major difference between the OODBMS's being developed by
the "O" companies and the RDBMS's currently being marketed is the
application program caching they are implementing.  most benchmarks that 
i've seen that supposedly show why OODBMS's win over RDBMS's (e.g., the
sun, tektronix, and maier benchmarks) seem to be totally dominated by
queries that must be implemented on a main memory database in the
application program address space.  current RDBMS application development
tools don't do this.  but, they probably will.  see my research at berkeley
over the past couple of years on the picasso shared object hierarchy which
is persistent CLOS with objects stored in postgres.
	larry