Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!decwrl!ucbvax!ulysses!swfc
From: swfc@ulysses.att.com (Shu-Wie F Chen)
Newsgroups: comp.databases
Subject: Re: Is RDBMS unproven technology?  (Flames to follow....)
Message-ID: <13532@ulysses.att.com>
Date: 3 Aug 90 20:06:22 GMT
References: <1073@ashton.UUCP> <10371@sybase.sybase.com>
Sender: netnews@ulysses.att.com
Reply-To: swfc@ulysses.att.com (Shu-Wie F Chen)
Organization: AT&T Bell Labs
Lines: 153

In article <10371@sybase.sybase.com>, tim@ohday.sybase.com (Tim Wood) writes:
|>In article <1073@ashton.UUCP> tomr@ashton.UUCP (Tom Rombouts) writes:
|>>Are relational databases an unproven technology regarding
|>>performance?  
|>>
|>>....A key tenet of the report is that RDBMS technology has been
|>>available for 20 years but still has not been proved in large,
|>>complex applications.  The report notes that users associate
|>>these products with poor system performance, even though they may
|>>be flexible and easier to implement.  The article then goes on 
|>>to cite a firm that is reluctant to replace IMS with DB2, and
|>>discusses other sites that use a mixture of relational and
|>>possibly non-relational systems.
|>

[some deleted stuff]

|>
|>Relational systems have so far been deployed in smaller-scale
|>applications than have hierarchical and network systems.  This is due
|>to several factors: relational is "newer" (that is, the technology
|>existed long before successful commercial products) and the older
|>database architectures were deployed in the days when nearly all
|>commercial computing resources were centralized and operating in a
|>batch-processing environment.  In that environment, updates and access
|>to the database are relatively rigidly controlled.

I don't see how these reasons (which are not incorrect) explain why
relational systems have so far only been deployed in smaller-scale
applications.

|>
|>The appeal of relational systems has been the promise of flexible
|>access to the database by users far removed from the DP department.

RDBMSs have made two contributions:

1. non-procedural access
2. data independence

I don't see what relational systems have to do with "the promise of
flexible access ... far removed from the DP department."  Are you
implying that network communication or client/server is restricted to
relational systems?

|>The trend toward decentralized access has been strengthened by the
|>growth of processing power directly available to individual users, and
|>by the changing nature of applications themselves.

Hmmm.  Last week I spoke with a Sybase tech support person who said that
Sybase's client/server architecture was geared toward having most of the
computation performed at the server end.  My response was "How about all
that CPU power directly available to the user?"  It seems that Sybase
feels that database computation should not be done at the client end (I
read this as the personal workstation) because it would take away CPU
cycles needed for editing, reading news, etc.  They believe they can
overcome the CPU
bottleneck at the server end.  This seems to contradict the above
statement by Tim (who works for Sybase).

[sorry for this digression, but Tim's position (which I agree with)
seems to differ from his company's]

|>Relational systems lend themselves well to distributed database, where
|>by definition there will be fewer, if any, centralized points of
  ^^^^^^^^^^^^^

Huh?  What definition?

I think relational systems lend themselves well to distributed databases
because they are set-oriented, rather than navigational systems like the
hierarchical and network models.  You can think in terms of sets of
tuples coming from each site instead of thinking on the level of
individual records.
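
To make the contrast concrete, here is a rough sketch in Python.  The
site names, the EMP-style tuples, and the message counting are all
made up for illustration; no real DBMS works exactly like this.  The
set-oriented version sends one request per site and unions the
answers, while the navigational version pays one round trip per
record examined.

```python
# Hypothetical data: each site holds a fragment of the same relation.
SITES = {
    "site_a": [("smith", "sales", 30), ("jones", "eng", 45)],
    "site_b": [("chen", "eng", 50), ("wood", "sales", 28)],
}


def select_at_site(rows, predicate):
    """Run the whole selection at the site; ship back a set of tuples."""
    return {row for row in rows if predicate(row)}


def set_oriented_query(sites, predicate):
    """One request per site; the answer is the union of per-site sets."""
    result = set()
    messages = 0
    for rows in sites.values():
        messages += 1
        result |= select_at_site(rows, predicate)
    return result, messages


def navigational_query(sites, predicate):
    """Record-at-a-time: one simulated round trip per record examined."""
    result = set()
    messages = 0
    for rows in sites.values():
        for row in rows:  # each iteration stands in for a remote "get next"
            messages += 1
            if predicate(row):
                result.add(row)
    return result, messages


def is_eng(row):
    return row[1] == "eng"


set_result, set_msgs = set_oriented_query(SITES, is_eng)
nav_result, nav_msgs = navigational_query(SITES, is_eng)
print(set_result == nav_result, set_msgs, nav_msgs)
```

Both versions return the same tuples; only the number of simulated
messages differs, and that gap grows with the number of records at
each site.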

|>transaction processing activity.  So, an individual site in the
|>distributed relational database may look "slow" compared to IMS on an
|>IBM-MVS 3090, but the aggregate throughput of the networked database
|>can be prodigious.  

Is this an argument for throughput over response time?  From the user's
point of view, it is much easier to gauge response time.
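
A back-of-the-envelope sketch of the distinction, in Python.  The
numbers are invented for illustration (they are not measurements of
any product), and the model assumes each site handles one transaction
at a time.

```python
def aggregate_throughput(sites, response_time_s):
    """Total TPS when each site completes one transaction per response time."""
    return sites * (1.0 / response_time_s)


# Illustrative configurations only, not benchmark results.
central = {"sites": 1, "response_time_s": 0.05}
distributed = {"sites": 20, "response_time_s": 0.5}

central_tps = aggregate_throughput(**central)
distributed_tps = aggregate_throughput(**distributed)

print("central:     %.0f TPS, %.2f s per transaction"
      % (central_tps, central["response_time_s"]))
print("distributed: %.0f TPS, %.2f s per transaction"
      % (distributed_tps, distributed["response_time_s"]))
```

Under these assumptions the distributed configuration has twice the
aggregate throughput, yet each user waits ten times longer for a
single transaction.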

|>
|>This is not to say that there is some theoretical limit to the
|>performance of individual relational database engines.  Indeed, a major
|>focus of the industry now is to develop local transaction processing
|>speeds that rival those of older architectures on a platform of similar
|>scale.  Today's technology is proving (already has, actually) that the
|>assertion that relational is slow is out-of-date.  What's more,

I think that that assertion was proven incorrect about 10-15 years ago.

|>relational technology is solving the problems of distributed
|>applications better than the older architectures, at transaction speeds
|>that are so far adequate for most applications.  A recent Digital
|>Review survey asked relational users about their throughput
|>requirements.  They found that about 90% of applications required no
|>more than about 12TPS.  This TPS number will surely increase, as will
|>the ability of relational systems to carry more load.

The figure 12TPS by itself is meaningless.  Any such figure should be
accompanied by the number of users, the architecture, etc.  Sybase
claims 34 TPS for 30(?) users on a Sun-4.  What do other vendors claim?

|>
|>Organizational reluctance to replace existing non-relational systems
|>is now very understandable.  Replacements will occur as the economic

RDBMSs have their benefits.  Non-RDBMSs have their benefits.  Though it
is true that RDBMSs are not as slow as anti-RDBMSers (of the great
debate at SIGMOD in the 70's) claimed them to be, they still do not
match the performance of navigational systems like IMS.  This is one of
the reasons that many corporations have not moved from IMS to relational
systems.  12TPS may be acceptable to relational users, but it surely
isn't for IMS users.

|>benefits of the distributed high-performance relational model increasingly
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Where?  Commercial RDBMS vendors claim high performance.  What they are
really claiming is that their RDBMS performs faster than the
competition.  1000TPS is high performance.  12TPS (or 34) is merely
acceptable.

|>outweigh the costs of changing.  Actual replacements will be preceded
|>by gradual integration of RDBMS into the organization's DP framework.
|>It is important for relational products to allow connection with
|>existing heterogenous systems, rather than requiring their replacement.

As I stated earlier, the two major contributions of the relational model
have been non-procedural access and data independence.  However, the
implementation to provide these features will incur overhead that
navigational systems (like hierarchical and network) do not have to pay
for.  For instance, joins are a major performance killer for
relational systems.  So there is some substance behind users
associating relational "... products with poor system performance, even
though they may be flexible and easier to implement." [from the original
posting on the British report]
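
A toy illustration of where the join overhead comes from, in Python.
The department/employee data and the counters are hypothetical, and
the join shown is the naive nested-loop kind; real systems use
indexes and better join methods, so treat this as a sketch of the
principle rather than of any actual engine.

```python
# Hypothetical parent/child data: departments and their employees.
DEPTS = [(1, "sales"), (2, "eng")]
EMPS = [("smith", 1), ("jones", 2), ("chen", 2), ("wood", 1)]


def nested_loop_join(depts, emps):
    """Relational style: match rows by value, counting key comparisons."""
    comparisons = 0
    out = []
    for dept_id, dept_name in depts:
        for emp_name, emp_dept in emps:
            comparisons += 1
            if emp_dept == dept_id:
                out.append((dept_name, emp_name))
    return out, comparisons


def build_hierarchy(depts, emps):
    """Navigational style: link children to their parent at load time."""
    tree = {dept_id: (dept_name, []) for dept_id, dept_name in depts}
    for emp_name, emp_dept in emps:
        tree[emp_dept][1].append(emp_name)
    return tree


def navigate(tree):
    """Follow parent-to-child links; no key comparisons at query time."""
    follows = 0
    out = []
    for dept_name, children in tree.values():
        for emp_name in children:
            follows += 1
            out.append((dept_name, emp_name))
    return out, follows


joined, comparisons = nested_loop_join(DEPTS, EMPS)
walked, follows = navigate(build_hierarchy(DEPTS, EMPS))
print(sorted(joined) == sorted(walked), comparisons, follows)
```

The navigational version does less work per query because the access
path was fixed when the data was loaded.  That fixed path is exactly
what data independence gives up, which is why the relational side
pays at query time instead.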

As Tom [the original poster] suggested, let's not start a war telling
each other how wonderful relational technology is.

But to answer Tom's question on whether "relational" has to mean "overhead":
Relational does not mean overhead, but since it provides more "features"
(flexible, easier to implement, easier to use(?)), some overhead *must*
be incurred.

I think a good discussion would be over where the overheads are.  For
starters, relational query compilers have to be smarter.  But they may
not (may never?) be smart enough!

Flames to /dev/null
Discussion to comp.databases

*swfc