Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!decwrl!ucbvax!ulysses!swfc
From: swfc@ulysses.att.com (Shu-Wie F Chen)
Newsgroups: comp.databases
Subject: Re: Is RDBMS unproven technology? (Flames to follow....)
Message-ID: <13532@ulysses.att.com>
Date: 3 Aug 90 20:06:22 GMT
References: <1073@ashton.UUCP> <10371@sybase.sybase.com>
Sender: netnews@ulysses.att.com
Reply-To: swfc@ulysses.att.com (Shu-Wie F Chen)
Organization: AT&T Bell Labs
Lines: 153

In article <10371@sybase.sybase.com>, tim@ohday.sybase.com (Tim Wood) writes:
|>In article <1073@ashton.UUCP> tomr@ashton.UUCP (Tom Rombouts) writes:
|>>Are relational databases an unproven technology regarding
|>>performance?
|>>
|>>....A key tenet of the report is that RDBMS technology has been
|>>available for 20 years but still has not been proved in large,
|>>complex applications. The report notes that users associate
|>>these products with poor system performance, even though they may
|>>be flexible and easier to implement. The article then goes on
|>>to cite a firm that is reluctant to replace IMS with DB2, and
|>>discusses other sites that use a mixture of relational and
|>>possibly non-relational systems.
|>
[some deleted stuff]
|>
|>Relational systems have so far been deployed in smaller-scale
|>applications than have hierarchical and network systems. This is due
|>to several factors: relational is "newer" (that is, the technology
|>existed long before successful commercial products) and the older
|>database architectures were deployed in the days when nearly all
|>commercial computing resources were centralized and operating in a
|>batch-processing environment. In that environment, updates and access
|>to the database are relatively rigidly controlled.

I don't see how these reasons (which are not incorrect) explain why
relational systems have so far only been deployed in smaller-scale
applications.
|>
|>The appeal of relational systems has been the promise of flexible
|>access to the database by users far removed from the DP department.

RDBMSs have made two contributions:

1. non-procedural access
2. data independence

I don't see what relational systems have to do with "the promise of
flexible access ... far removed from the DP department." Are you implying
that network communication or client/server is restricted to relational
systems?

|>The trend toward decentralized access has been strengthened by the
|>growth of processing power directly available to individual users, and
|>by the changing nature of applications themselves.

Hmmm. Last week I spoke with a Sybase tech support person who said that
Sybase's client/server architecture was geared toward having most of the
computation performed at the server end. My response was "How about all
that CPU power directly available to the user?" It seems that Sybase feels
that database computation should not be done at the client end (I read
this as the personal workstation) because it would take away CPU cycles
for editing, reading news, etc. They believe they can overcome the CPU
bottleneck at the server end. This seems to contradict the above statement
by Tim (who works for Sybase).

[sorry for this digression, but Tim's position (which I agree with) seems
to differ from that of his company]

|>Relational systems lend themselves well to distributed database, where
|>by definition there will be fewer, if any, centralized points of
  ^^^^^^^^^^^^^
Huh? What definition? I think relational systems lend themselves well to
distributed databases because they are set-oriented, unlike navigational
systems such as the hierarchical and network models. You can think in
terms of sets of tuples coming from each site instead of thinking on the
level of individual records.

|>transaction processing activity.
|>So, an individual site in the
|>distributed relational database may look "slow" compared to IMS on an
|>IBM-MVS 3090, but the aggregate throughput of the networked database
|>can be prodigious.

Is this an argument for throughput over response time? From the user's
point of view, it is much easier to gauge response time.

|>
|>This is not to say that there is some theoretical limit to the
|>performance of individual relational database engines. Indeed, a major
|>focus of the industry now is to develop local transaction processing
|>speeds that rival those of older architectures on a platform of similar
|>scale. Today's technology is proving (already has, actually) that the
|>assertion that relational is slow is out-of-date. What's more,

I think that that assertion was proven incorrect about 10-15 years ago.

|>relational technology is solving the problems of distributed
|>applications better than the older architectures, at transaction speeds
|>that are so far adequate for most applications. A recent Digital
|>Review survey asked relational users about their throughput
|>requirements. They found that about 90% of applications required no
|>more than about 12TPS. This TPS number will surely increase, as will
|>the ability of relational systems to carry more load.

The figure 12TPS by itself is meaningless. Any such figure should state
how many users, what architecture, etc. Sybase claims 34 TPS for 30(?)
users on a Sun-4. What do other vendors claim?

|>
|>Organizational reluctance to replace existing non-relational systems
|>is now very understandable. Replacements will occur as the economic

RDBMSs have their benefits. Non-RDBMSs have their benefits. Though it is
true that RDBMSs are not as slow as anti-RDBMSers (of the great debate at
SIGMOD in the 70's) claimed them to be, they still do not match the
performance of navigational systems like IMS. This is one of the reasons
that many corporations have not moved from IMS to relational systems.
12TPS may be acceptable to relational users, but it surely isn't for IMS
users.

|>benefits of the distributed high-performance relational model increasingly
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Where? Commercial RDBMS vendors claim high-performance. What they are
really claiming is that their RDBMS performs faster than the competition.
1000TPS is high-performance. 12TPS (or 34) is acceptable.

|>outweigh the costs of changing. Actual replacements will be preceded
|>by gradual integration of RDBMS into the organization's DP framework.
|>It is important for relational products to allow connection with
|>existing heterogenous systems, rather than requiring their replacement.

As I stated earlier, the two major contributions of the relational model
have been non-procedural access and data independence. However, the
implementation that provides these features incurs overhead that
navigational systems (like hierarchical and network) do not have to pay
for. For instance, joins are a real big performance killer for relational
systems. So there is some substance behind the users associating
relational "... products with poor system performance, even though they
may be flexible and easier to implement." [from the original posting on
the British report]

As Tom [the original poster] suggested, let's not start a war telling each
other how wonderful relational technology is. But to answer Tom's question
on whether "relational" has to mean "overhead": Relational does not mean
overhead, but since it provides more "features" (flexible, easier to
implement, easier to use(?)), some overhead *must* be incurred. I think a
good discussion would be over where the overheads are. For starters,
relational query compilation has to be smarter. But query compilers may
not (never?) be smart enough!?!

Flames to /dev/null
Discussion to comp.databases

*swfc
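
P.S. To make the point about join overhead a bit more concrete, here is a
minimal sketch in Python. It is a toy under stated assumptions, not how IMS
or any commercial RDBMS is actually implemented: the data, the field names
(dept_id, name, employees) and the three function names are all invented
for illustration. It contrasts a naive relational join, a smarter
hash-based join, and navigational access where the parent/child link is
already stored in the data.

```python
# Toy data; every name below is hypothetical and chosen only for illustration.
departments = [{"dept_id": 1, "name": "R&D"}, {"dept_id": 2, "name": "Sales"}]
employees = [
    {"dept_id": 1, "name": "Ann"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
]

# The same information stored hierarchically: children live under the parent.
hierarchy = [
    {"name": "R&D", "employees": ["Ann", "Cy"]},
    {"name": "Sales", "employees": ["Bob"]},
]


def nested_loop_join(left, right, key):
    """Naive relational join: compare every left row with every right row."""
    pairs, comparisons = [], 0
    for l in left:
        for r in right:
            comparisons += 1
            if l[key] == r[key]:
                pairs.append((l["name"], r["name"]))
    return pairs, comparisons


def hash_join(left, right, key):
    """Smarter join: build a hash table on left, probe it once per right row."""
    table = {}
    for l in left:
        table.setdefault(l[key], []).append(l)
    pairs, probes = [], 0
    for r in right:
        probes += 1
        for l in table.get(r[key], []):
            pairs.append((l["name"], r["name"]))
    return pairs, probes


def navigational_scan(tree):
    """Navigational access: follow the stored parent/child links, no matching."""
    return [(d["name"], e) for d in tree for e in d["employees"]]


if __name__ == "__main__":
    print(nested_loop_join(departments, employees, "dept_id"))
    print(hash_join(departments, employees, "dept_id"))
    print(navigational_scan(hierarchy))
```

All three return the same department/employee pairs. The relational
versions have to do matching work at query time (how much depends on how
smart the join strategy is), while the navigational version paid that cost
when the data was stored. In exchange, it only answers cheaply the
questions its stored structure anticipated.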