Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!uunet!decwrl!ucbvax!mtxinu!sybase!ohday!tim
From: tim@ohday.sybase.com (Tim Wood)
Newsgroups: comp.databases
Subject: Re: Is RDBMS unproven technology? (Flames to follow....)
Message-ID: <10419@sybase.sybase.com>
Date: 6 Aug 90 00:12:18 GMT
References: <1073@ashton.UUCP> <10371@sybase.sybase.com> <13532@ulysses.att.com>
Sender: news@Sybase.COM
Organization: Sybase, Inc.
Lines: 174

In article <13532@ulysses.att.com> swfc@ulysses.att.com (Shu-Wie F Chen) writes:
>In article <10371@sybase.sybase.com>, tim@ohday.sybase.com (Tim Wood) writes:
>|>
>|>Relational systems have so far been deployed in smaller-scale
>|>applications than have hierarchical and network systems.
>
>I don't see ... why
>relational systems have so far only been deployed in smaller-scale
>applications.

What I'm driving at is that relational has not developed in a DP/MIS
context, and DP/MIS is where most of the large-scale business
applications have traditionally resided.  Relational is the architecture
of choice for the "bottom-up" development of organizational databases,
where local DP departments are creating relational databases to manage
their local operations, and looking for ways to tie all those local
databases together.

>|>The appeal of relational systems has been the promise of flexible
>|>access to the database by users far removed from the DP department.
>
>RDBMSs have made two contributions:
>1. non-procedural access
>2. data independence

True, but most users running canned applications won't be as aware of
these features as applications programmers, who benefit most from them.
I was really only discussing the end-users, since they are the largest
group of database users in an organization.  Your comment is correct
and rounds out my point.

>I don't see what relational systems have to do with "the promise of
>flexible access ... far removed from the DP department."
>Are you implying that network communication or client/server is
>restricted to relational systems?

No, but relational seems to be the context in which client/server is
most rapidly being deployed.  I do think it's easier to distribute a
relational db than a navigational (thanks, good adjective) one, because
of the looser coupling among data objects.

>|>The trend toward decentralized access has been strengthened by the
>|>growth of processing power directly available to individual users, and
>|>by the changing nature of applications themselves.
>
>Hmmm.  Last week I spoke with a Sybase tech support person who said that
>Sybase's client/server architecture was geared toward having most of the
>computation performed at the server end.  My response was "How about all
>that CPU power directly available to the user?"

Those MIPS are used for the applications.  That local power makes it
economical to perform complicated analysis and transformations on the
data.  Basically, the server preserves and disseminates existing
knowledge, but new knowledge is created on the front-end.  The front-end
then submits that new knowledge to the server, which may reject it
because the knowledge does not fit the world model known to the server
(in slogan-speak, this is "DBMS enforced integrity").  Or the server
accepts it, and the whole organization becomes "smarter."  This is still
a relatively new feature of products, an improvement over the case where
each application has to apply the model.

>It seems that Sybase
>feels that database computation should not be done at the client end...
>They believe they can overcome the CPU bottleneck at the server end...

At this point, "database computation" is too vague a term to allow a
response.

>|>Relational systems lend themselves well to distributed database, where
>|>by definition there will be fewer, if any, centralized [servers]
>
>Huh?  What definition?
If a database is distributed, then the database state is maintained by
more than one server.  The limiting case is where every machine on the
network is of similar size and maintains an equal part of the database.
A more likely scenario is a server hierarchy, such as in telephone
exchanges.

>I think relational systems lend themselves well to distributed databases
>because they are set-oriented, rather than navigational systems like the
>hierarchical and network models.

That's what I was driving at.  Thanks for the words beyond "so many words".

>|>... the aggregate throughput of the networked database can be prodigious.
>
>Is this an argument for throughput over response time?  From the user's
>point of view, it is much easier to gauge response time.

Distributed balances both.  It's analogous to caching or virtual memory,
in that you have a small, frequently-used subset of the database that is
local, so average response times are close to what they would be if the
entire database were local on a behemoth machine.  Yet the whole database
might be so large that it would take a buildingful of 3090's (or clones :-)
to hold it all locally.  I am speaking in generalities here; much design
and measurement must go into one's distributed db schema so that average
response time is good and the worst case not awful.  Probably beyond the
state of the practice today.  Maybe one reason why distributed is slow
catching on.

>|>... Today's technology is proving (already has, actually) that the
>|>assertion that relational is slow is out-of-date.  What's more,
>
>I think that that assertion was proven incorrect about 10-15 years ago.

It was proven that RDBMS COULD be as fast as existing navigational
systems, but there haven't been competitive products till recently.
"Proof" for many folks requires no less than a released (or announced :-)
product.

>|>... A recent Digital
>|>Review survey asked relational users about their throughput
>|>requirements.
>|>They found that about 90% of applications required no
>|>more than about 12TPS.
>
>The figure 12TPS by itself is meaningless.  How many users, what
>architecture, etc. should accompany any figures.  Sybase claims 34 TPS
>for 30(?) users on a Sun-4.  What do other vendors claim?

Twelve TPS as measured at the server.  So as you pile on users, response
time will tank (i.e., go up).  I think the survey intended an implicit
clause, "with acceptable user response time."

>....  One of the
>reasons that many corporations have not moved from IMS to relational
>systems is [unacceptable performance].  12TPS may be acceptable to relational
>users, but it surely isn't for IMS users.

'Course not.  For large DP hardware, you'd better be talking well into
the 100's.  To handle volume, an RDBMS product has to scale with hardware.

>... 1000TPS is high-performance.  12TPS (or 34) is acceptable.

Show me someone getting 1000TPS on a Sun 3/280.  What must not exist
in a product is a performance ceiling above which throughput stops
growing (linearly) with increase in platform scale.  That's the essence
of the perceived "relational bug."

>... [J]oins are a real big performance killer for relational systems.

Not if they are pre-optimized or pre-computed.

>So there is some substance behind the users
>associating relational "... products with poor system performance, even
>though they may be flexible and easier to implement."[from the original
>posting on the British report]

Sure, the substance is based on historical knowledge.  That knowledge is
being obsoleted by the onset of RDBMSs that scale well.  I believe users
will be able to have both DP-scale performance and ease of use in RDBMS
in the near future.

>But to answer Tom's question on whether "relational" has to mean "overhead":
>Relational does not mean overhead, but since it provides more "features"
>(flexible, easier to implement, easier to use(?)), some overhead *must*
>be incurred.  The question is where to place that overhead.
That's one problem we (Sybase anyway) are trying to solve.

>I think a good discussion would be over where the overheads are.  For
>starters, relational query compilation has to be smarter.

Hmm, I've been developing the opinion that query compilation is a
largely solved problem (cost-based optimizers, etc.), but that
fundamental things like I/O management and access-method policies
need a lot more work in RDBMS.  So it sounds like we have a good
discussion ahead of us :-) .
-TW
---
Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA / 94608    415-596-3500
tim@sybase.com    {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
One day, when I can afford enough lawyers, I will speak for a whole company.
For now, I speak just for myself.
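
P.S.  On "DBMS enforced integrity" as described above (the front-end
submits new knowledge, and the server accepts or rejects it against its
own model): a minimal sketch, not any vendor's API.  It assumes an
in-memory SQLite database standing in for the server.  The table
`account`, its non-negative-balance rule, and the helpers `make_db` and
`submit` are all hypothetical names chosen for illustration.

```python
import sqlite3


def make_db():
    """Create the stand-in 'server'; the model lives in the schema itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    return conn


def submit(conn, account_id, balance):
    """Front-end submits a row; return True if the server accepts it."""
    try:
        # 'with conn' commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "INSERT INTO account (id, balance) VALUES (?, ?)",
                (account_id, balance),
            )
        return True
    except sqlite3.IntegrityError:
        # The row violates a constraint declared in the schema, so the
        # server rejects it regardless of which application sent it.
        return False


if __name__ == "__main__":
    conn = make_db()
    print(submit(conn, 1, 100))  # fits the model
    print(submit(conn, 2, -5))   # violates CHECK (balance >= 0)
```

The point of the design is that the rule is declared once, at the
server, so no application has to re-implement it and none can bypass it.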