Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!sri-spam!mordor!lll-tis!ptsfa!hippo!eric
From: eric@hippo.UUCP (Eric Bergan)
Newsgroups: comp.databases
Subject: Re: Database Machines
Message-ID: <131@hippo.UUCP>
Date: Wed, 24-Jun-87 12:04:10 EDT
Article-I.D.: hippo.131
Posted: Wed Jun 24 12:04:10 1987
Date-Received: Fri, 26-Jun-87 05:13:49 EDT
References: <2861@blia.BLI.COM> <2918@zen.berkeley.edu>
Organization: HEALTHCARE 2000
Lines: 95
Keywords: servers vs. distrib is more interesting...

In article <2918@zen.berkeley.edu>, larry@ingres.Berkeley.EDU (Larry Rowe) writes:
> In article <2861@blia.BLI.COM> billc@blia.BLI.COM (Bill Coffin) writes:
> >>From larry@ingres.Berkeley.EDU (Larry Rowe) Wed Jun 17 09:19:02 1987
> >>2. Distributed DBMS's are a big, big win. [ ... ]
> >
> >Why? I'm convinced that many of the people who THINK they want
> >distributed DBMS's REALLY need server architectures. See Jim Gray's
> >article in the May UNIX REVIEW. I won't elaborate here -- but
> >I would like to see a distribution vs. server discussion on the net.
>
> i haven't read jim's article, but knowing him and having read some tandem
> tech reports on the topic, i think i have some additional insights. first,
> application design for distributed applications is very, very hard. average
> people don't have the experience and the vendors products do not offer enough
> help yet to make it easy to define them. consequently, only pioneers and
> very brave people will attempt to build them. btw, tandem has only recently
> come out with an SQL interface to its distributed dbms offering. i'll be
> curious to see how much usage goes up now that end-users and mere humans
> can access the distributed databases.
>
> second, tandem's distributed dbms is a single-vendor hardware solution.
> when i visit companies and universities, senior managers say their number
> 1 problem is managing the diversity of hardware/software that proliferates
> through the organization. a distributed heterogenous dbms can cover up
> this diversity and give people control again of their corporate data. the
> complete solution to this will take years to achieve, but from my discussions,
> people really want it. also, some new application growth areas (e.g.,
> factory automation) insist on distributed dbms's. so, i stand by my statement.

I think for this argument (and particularly for Gray's papers, both in
Unix Review and in the June 1986 issue of IEEE Transactions on Software
Engineering), it is important to distinguish two very different uses of
relational databases. The first is what most of the database products
were initially used for - ad hoc queries against a database, where the
number of queries far exceeds the number of updates. Such applications
typically have a relatively low number of transactions per second, but
the transactions themselves tend to be more complex - joins, aggregates,
etc.

The second is a much more transaction-oriented system, the classic case
being airline reservations. Here, transactions tend to be simple, but the
transaction rate is much higher. I think Gray's comments are addressed
mainly to the transaction-oriented distributed applications.

In a transaction-oriented system, transparency is much less important.
The queries are determined at the time the application is written, so it
is possible to map out which servers have the data needed for a given
transaction. There are almost no "ad hoc" queries that would require
some kind of distributed optimizer to sort out at query time.
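
To make the "map it out when the application is written" idea concrete,
here is a rough sketch in Python. Everything in it is hypothetical: the
transaction names, the server names, and the servers_for() helper are made
up for illustration and do not come from Gray's papers or any product.

```python
# Hypothetical static routing table. Every transaction type is known when
# the application is written, so each one can be pinned in advance to the
# servers that hold its data. All names here are invented for illustration.
ROUTES = {
    "book_seat": ["reservations"],
    "cancel_seat": ["reservations"],
    "bill_customer": ["reservations", "accounting"],
}


def servers_for(transaction):
    """Return the servers a predefined transaction must visit."""
    try:
        return ROUTES[transaction]
    except KeyError:
        # An ad hoc request has no precomputed route. This is the case
        # where a "transparent" system would need a distributed optimizer.
        raise ValueError(f"no static route for {transaction!r}") from None
```

The lookup is trivial on purpose: nothing is decided at query time, which
is the property that makes transparency less important here.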

Gray's point (which I think Larry is talking about) is that there are
enough other headaches in a transaction-oriented system, just with the
networking and understanding the design of the application, without
having to worry about how efficiently the database system decides to
process the query. This is especially true when hooking up heterogeneous
machines and databases. The chances of having an MVS VSAM file become a
"transparent" part of a distributed relational database are pretty small.
But it is feasible to hook up a server to it that can participate in a
distributed requester/server model.

This does, of course, have the problem of having to change the
application(s) if you decide to move the data partitioning around. I like
the Sybase approach to this - namely, the transactions are stored in the
database itself, rather than in the applications. While you still have to
change the transactions if you change the schema, at least they are all
in one place, and the applications themselves do not have to be rebuilt.

I think the real challenge for the database vendors will be how to
interact with other database products - primarily non-relational ones.
A surprising number of corporate databases live in VSAM files or the
like - very few are relational. Given that, trying to force relational
semantics on these databases is going to be very difficult. While I
believe that most of these corporate databases will someday convert to
relational databases, I think the transition time will be at least 10
years, maybe longer. The transition will happen as applications are
replaced - not because they are converted.

One final point in the distributed vs. single server discussion. Very
few applications live in a vacuum (or if they do, it is because they
were forced to).
Almost all of them would like to be able to share data with other,
related applications that already exist, have their own database systems
in place, and either work perfectly well or would be too expensive to
convert to something else. A single server model does not seem able to
handle the economics (and sometimes politics) of such a case. A
distributed system (not necessarily "transparent") does allow the new
applications to share the data with minimal (no?) impact on the existing
applications.

Bill - do you envision a single server approach also being desirable in
the case of a geographically distributed system, where the sites are
primarily autonomous, but some data replication and cross-site queries
are desirable? How would you design such a system?

--
eric
...!ptsfa!hippo!eric