Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!mit-eddie!think!ames!ucbcad!zen!ingres.Berkeley.EDU!larry
From: larry@ingres.Berkeley.EDU (Larry Rowe)
Newsgroups: comp.databases
Subject: Re: Database Machines
Message-ID: <2891@zen.berkeley.edu>
Date: Wed, 17-Jun-87 12:19:02 EDT
Article-I.D.: zen.2891
Posted: Wed Jun 17 12:19:02 1987
Date-Received: Sun, 21-Jun-87 09:59:19 EDT
References: <2700@blia.BLI.COM> <851@rtech.UUCP> <2042@utah-gr.UUCP> <863@rtech.UUCP> <111@blic.BLI.COM>
Sender: news@zen.berkeley.edu
Reply-To: larry@ingres.Berkeley.EDU.UUCP (Larry Rowe)
Organization: University of California, Berkeley
Lines: 88

Several comments on the recent discussion of ``database machines.''

1. I too am skeptical that custom hardware can be made price/performance competitive with software database systems. While I agree with the folks from Britton-Lee that they can use new technology to build the next generation hardware sooner than a vanilla hardware vendor, I don't think they can sell enough boxes to make a very profitable business.

Britton-Lee has had a rough time the past 12-18 months because they are selling products based on 2-5 year old technology (Z8000 + custom processor). The rumors about their new machine are that it is a tightly-coupled, shared memory processor. You can buy the same hardware from a vanilla vendor today and run a software DBMS on it. Examples are: Stratus, Sequent, Encore, MIPS, etc. The software DBMS's will do the same thing that Britton-Lee will do in terms of shared memory buffer managers, etc., so the solutions will be roughly the same. (Of course, at any given time one vendor's product will be ahead of or behind another vendor's -- Britton-Lee has done a good job delivering DBMS software.)

Now here's the rub. The vanilla hardware vendors will sell several thousand of their boxes. When DEC delivers their tightly-coupled, shared memory processor, they will deliver tens of thousands.
Britton-Lee will be lucky to sell a thousand. The vanilla vendors will have more sales over which to amortize their costs. They will drive the cost down on the boxes as they compete, and Britton-Lee will have a harder time maintaining their margins.

The key advantage that Britton-Lee has is their software. The software DBMS vendors have not directly attacked this market (e.g., by OEM'ing hardware and doing more software customization) because the sales opportunity is much, much greater in the ``run everywhere'' and distributed, heterogeneous DBMS markets. They have 50-100 man-years of development to do to be competitive in that market. The dbmachine market is too small.

2. The above analysis says that Sybase has a credible strategy because they are doing software customization on a few machines. A good example is their Unix kernel mods for the Sun. They deliver improved performance at a specific cost -- running a nonstandard OS. It remains to be seen if Sybase can deliver a robust system that matches the advertised performance claims in a production environment. Also, they will be pressured into ``running everywhere'' (they've announced a VAX product and their recent venture with Microsoft suggests a lot of work on PC hardware) and they will quickly fall into the morass of customizing the code for N environments (e.g., do you run on the VAX cluster yet, how's your MAP network protocol support, ...).

3. Teradata sells custom hardware and software. However, it is my opinion that 90% of their advantage comes from the fact that they are running a distributed DBMS on multiple machines. The only novel feature of their architecture is the Y-Net, which they claim gives them big performance improvements. I'd really love to see a benchmark with identical hardware/software except a different network.

One advantage that I can see to the Y-Net is the parallel sorting capability.
However, Jim Gray wrote a parallel sort package at Tandem that sped up sorting to roughly twice the time to read the data (i.e., you must read and write the data at least once -- sorting is overlapped completely with it). So, how does a software parallel sort compare to the Y-Net?

Another possible advantage of the Y-Net is response time and throughput during heavy loads. A loaded LAN can bog down when many messages are clogging the net. Remember that distributed DBMS's will ship a lot of data around to answer ad hoc queries. Another interesting experiment: how fast an Ethernet or token ring is needed to achieve the same performance?

4. The Tandem high transaction rates come from a vanilla distributed relational DBMS running on vanilla hardware. The big difference is that they have spent 10 years optimizing their storage system, buffer manager, logging system, etc.

5. Another thing. When doing performance comparisons, it is important to compare apples and apples. Numbers ought to be $'s/xact (guess what, a program on an IBM 3094 is faster than a VAX!) and/or use identical hardware (2 processors are better than 1). I'm tired of seeing claims that a dbmachine is faster than a loaded central machine. Of course it is, the central machine has other things to do. Compare the performance/cost to buying a larger central machine or buying a second general purpose processor.

-------
Bottom line:

1. Hardware is nice but software is cheaper and probably faster. (Larry's Lament: Hardware companies build bigger valuations faster because of the size of the business (i.e., more revenues and more expenses). Software companies ought to produce higher profits....)

2. Distributed DBMS's are a big, big win. Every DBMS vendor better have one by 1990 or they will be seriously disadvantaged in the marketplace. So, where are all these vendors going to find capital to fund a 20 man-year project to build a distributed DBMS?

3. Benchmark wars will continue to be fought and they might tell you something, and then again, they might lie.
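
To make the apples-and-apples point in item 5 concrete, here is a minimal sketch of the comparison I have in mind: divide the total price of each configuration by its throughput and rank the results. I am reading ``$'s/xact'' as dollars per transaction-per-second, which is an assumption. The names `Config`, `dollars_per_tps`, and `rank_by_price_performance` are hypothetical, and every dollar and throughput figure below is invented for illustration, not a measurement of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One candidate system; all figures are made-up placeholders."""
    name: str
    total_cost_dollars: float  # price of everything needed to run the workload
    xacts_per_sec: float       # throughput on the SAME benchmark for every entry


def dollars_per_tps(cfg: Config) -> float:
    """Price/performance: dollars paid per transaction-per-second delivered."""
    if cfg.xacts_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return cfg.total_cost_dollars / cfg.xacts_per_sec


def rank_by_price_performance(configs):
    """Return configs ordered from cheapest to most expensive per unit throughput."""
    return sorted(configs, key=dollars_per_tps)


# Hypothetical numbers only: the three alternatives named in item 5.
CANDIDATES = [
    Config("larger central machine", 600_000.0, 60.0),
    Config("central machine + dbmachine", 550_000.0, 50.0),
    Config("central machine + second processor", 540_000.0, 60.0),
]

if __name__ == "__main__":
    for cfg in rank_by_price_performance(CANDIDATES):
        print(f"{cfg.name}: ${dollars_per_tps(cfg):,.0f} per xact/sec")
```

The point of the sketch is the shape of the comparison, not the outcome: each alternative is charged its full cost and measured on the same workload, so a dbmachine is never compared against a central machine that is busy doing other work.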