Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!ames!ames.arc.nasa.gov!lamaster
From: lamaster@ames.arc.nasa.gov (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Re: DataBase Machines
Keywords: database machines
Message-ID: <43855@ames.arc.nasa.gov>
Date: 28 Feb 90 01:52:38 GMT
Sender: usenet@ames.arc.nasa.gov
Organization: NASA - Ames Research Center
Lines: 59
References:

In reviewing my (old) files on DataBase machines (e.g. ShareBase ==
Britton Lee) it appeared that the main "problem" that a DBM is supposed
to address is the limitation of main memory bandwidth. Specifically, if
processing can be done "in the channel" rather than on the main
processor(s), then only data which will be used later will be written
to memory. Also, some level of parallelization can take place, and
cycle-consuming operations that can be done in parallel can be
offloaded.

Presumably, this allows cheaper machines with limited memory bandwidth
to avoid wasting that bandwidth on reading data into memory which will
be filtered out at the first pass, or on "trivial but expensive"
operations such as compression. Presumably the result is cheaper
parallelism.

Now, this is an issue which reappears in many forms: specialized
network processors, specialized graphics processors, and specialized
DataBase processors. The question with any such processor is whether it
provides a big enough performance boost to keep the product ahead of
general purpose machines, which are usually on a much faster design
cycle.

I have several questions:

1) Just what kind of performance, by various appropriate measures, do
the current crop of DBMs provide vs. standard architecture machines?
(BTW - I notice that there are now 2 measures of standard TPS units -
any enlightenment and/or correspondence between the two appreciated.)
Does anyone know what kind of performance IBM gets out of ACP and
RDBMSs, say, on its biggest iron? How about smaller machines?
Sun was quoting pretty high numbers, by one of the TPS measures, on the
4/490, for a bus-based workstation server. How about other, more
complex operations?

2) Has anyone considered an extended filesystem approach for Unix,
wherein specialized DataBase operations are supported in the
filesystem, and a specialized bus-based (e.g. VME, FutureBus) processor
is attached to the system? This would appear to be more flexible and
allow many companies to access the functionality, the same way that
they do with new disk controllers, etc. What primitive operations
should be supported: powerful enough to reduce traffic to memory by a
large fraction, general enough to use with a wide variety of database
systems?

What I envision is a definition for a dbfs, database filesystem, which
would support various RDBMSs such as Oracle and Sybase via multiple
dbfs's. The parallelism would have to be across controller
boards/filesystems, with processors in the controllers for the
operations. You might support better keyed access, compression, lock
support (what kind?), disk optimizations (the usual), and any other
useful parallelizable operations. In configuration terms, you might
have 4 controller boards on your VME (say) system, and get some
fraction of parallel performance across the boards.

Comments?

  Hugh LaMaster, m/s 233-9,     UUCP ames!lamaster
  NASA Ames Research Center     ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)604-6117
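
A rough sketch of the memory-bandwidth argument above, for concreteness.
This is only an illustration under assumed numbers: the 128-byte record
size, the 1% selectivity, and the names host_side_select and
channel_side_select are all hypothetical and do not describe any actual
DBM or controller. It counts the bytes that would cross into main memory
when the host does the filtering versus when a controller filters "in
the channel":

```python
# Illustrative sketch only. Record size, workload, and all names here are
# assumptions made for the example, not properties of any real DBM.
from dataclasses import dataclass
from typing import Callable, List, Tuple

RECORD_BYTES = 128  # assumed fixed record size


@dataclass(frozen=True)
class Record:
    key: int
    payload: str


Predicate = Callable[[Record], bool]


def host_side_select(disk: List[Record], predicate: Predicate) -> Tuple[List[Record], int]:
    """Conventional path: every record is written to main memory,
    then the host CPU discards the ones that do not match."""
    bytes_to_memory = 0
    kept = []
    for rec in disk:
        bytes_to_memory += RECORD_BYTES  # record crosses the memory bus first
        if predicate(rec):
            kept.append(rec)
    return kept, bytes_to_memory


def channel_side_select(disk: List[Record], predicate: Predicate) -> Tuple[List[Record], int]:
    """Filtering in the controller: only matching records reach memory."""
    bytes_to_memory = 0
    kept = []
    for rec in disk:
        if predicate(rec):  # test is applied before the transfer
            bytes_to_memory += RECORD_BYTES
            kept.append(rec)
    return kept, bytes_to_memory


def demo() -> Tuple[bool, int, int]:
    disk = [Record(key=i, payload=f"row{i}") for i in range(1000)]

    def wanted(rec: Record) -> bool:
        return rec.key % 100 == 0  # assumed 1% selectivity

    host_rows, host_bytes = host_side_select(disk, wanted)
    chan_rows, chan_bytes = channel_side_select(disk, wanted)
    return host_rows == chan_rows, host_bytes, chan_bytes


if __name__ == "__main__":
    same, host_bytes, chan_bytes = demo()
    print("same result:", same)
    print("bytes to memory, host filter:   ", host_bytes)
    print("bytes to memory, channel filter:", chan_bytes)
```

With these assumed numbers, demo() reports 128000 bytes to memory for
the host-side path and 1280 for the channel-side path. The ratio simply
tracks the selectivity of the query, so the benefit shrinks as more of
the data is actually wanted.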