Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!ames!ames.arc.nasa.gov!lamaster
From: lamaster@ames.arc.nasa.gov (Hugh LaMaster)
Newsgroups: comp.arch
Subject: Re: DataBase Machines
Keywords: database machines
Message-ID: <43855@ames.arc.nasa.gov>
Date: 28 Feb 90 01:52:38 GMT
Sender: usenet@ames.arc.nasa.gov
Organization: NASA - Ames Research Center
Lines: 59
References:

In reviewing my (old) files on DataBase machines (e.g. ShareBase == Britton Lee)
it appeared that the main "problem" that a DBM is supposed to address is
the limitation of main memory bandwidth.  Specifically, if processing
can be done "in the channel" rather than on the main processor(s), then
only data which will be used later will be written to memory.  Also, some
level of parallelization can take place, and cycle-consuming operations that
can be done in parallel can be offloaded.  Presumably, this
allows cheaper machines with limited memory bandwidth to avoid wasting
that bandwidth on reading data into memory which will be filtered out on
the first pass, or on "trivial but expensive" operations such as
compression.  Presumably the result is cheaper parallelism.
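
As a rough illustration of the bandwidth argument, here is a sketch in
Python.  Everything in it is an assumption made up for the example (the
128-byte record size, the sample data, the function names); it does not
describe any real DBM.  It only counts how many bytes would cross into
main memory when the selection is done on the host versus "in the channel".

```python
# Hypothetical sketch: record size, data, and names are illustrative only.
RECORD_BYTES = 128  # assumed fixed record size


def host_filter(records, predicate):
    """Every record is read into main memory; the host CPU then filters."""
    bytes_to_memory = len(records) * RECORD_BYTES
    kept = [r for r in records if predicate(r)]
    return kept, bytes_to_memory


def channel_filter(records, predicate):
    """The filter runs in the controller; only matches reach main memory."""
    kept = [r for r in records if predicate(r)]
    bytes_to_memory = len(kept) * RECORD_BYTES
    return kept, bytes_to_memory


def wanted(record):
    """Example selection: keep one department out of ten."""
    return record["dept"] == 3


records = [{"id": i, "dept": i % 10} for i in range(1000)]

host_rows, host_bytes = host_filter(records, wanted)
chan_rows, chan_bytes = channel_filter(records, wanted)

print("host:   ", host_bytes, "bytes to memory")
print("channel:", chan_bytes, "bytes to memory")
```

Both paths return the same rows; with a selection that keeps one record
in ten, the channel version moves one tenth of the bytes.  The real
saving obviously depends on the selectivity of the query.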

Now, this is an issue which reappears in many forms:

Specialized network processors,
Specialized graphics processors, and
Specialized DataBase processors.

The question with any such processor is whether it provides a big
enough performance boost to keep the product ahead of general purpose
machines, which are usually on a much faster design cycle.


I have several questions:

1)	Just what kind of performance, by various appropriate measures, does
	the current crop of DBMs provide vs. standard architecture machines?
	(BTW - I notice that there are now 2 standard measures of TPS;
	any enlightenment on them, and/or on the correspondence between the
	two, would be appreciated.)  Does anyone know what kind of performance
	IBM gets out of ACP and RDBMSs, say, on its biggest iron?  How about
	smaller machines?  Sun was quoting pretty high numbers, by one of the
	TPS measures, on the 4/490, for a bus-based workstation server.
	How about other, more complex operations?

2)	Has anyone considered an extended filesystem approach for Unix,
	wherein specialized DataBase operations are supported in
	the filesystem, and a specialized bus-based (e.g. VME, FutureBus)
	processor is attached to the system?  This would appear to be more
	flexible and allow many companies to access the functionality, the
	same way that they do with new disk controllers, etc.  What primitive
	operations should be supported: powerful enough to reduce traffic
	to memory by a large fraction, general enough to use with a wide
	variety of database systems?  What I envision is a definition for a
	dbfs, database filesystem, which would support various RDBMSs such as 
	Oracle and Sybase via multiple dbfs's.  The parallelism would have 
	to be across controller boards/filesystems, with processors in the
	controllers for the operations.  You might support better keyed access,
	compression, lock support (what kind?), disk optimizations (the usual),
	and any other useful parallelizable operations.  In configuration terms,
	you might have 4 controller boards on your VME (say) system, and get 
	some fraction of parallel performance across the boards.
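
To make the dbfs idea a little more concrete, here is a minimal sketch in
Python of what such an interface might look like.  All of it is
hypothetical: the class names (Controller, DbFs), the choice of primitives
(insert, keyed lookup, filtered scan), and the key-modulo partitioning are
assumptions for illustration, not a proposed standard.  The threads merely
stand in for processors on separate controller boards.

```python
from concurrent.futures import ThreadPoolExecutor


class Controller:
    """Hypothetical controller board holding one partition of the data."""

    def __init__(self):
        self.rows = {}  # key -> row, standing in for an on-board keyed index

    def insert(self, row):
        self.rows[row["key"]] = row

    def lookup(self, key):
        """Keyed access resolved on the board itself."""
        return self.rows.get(key)

    def scan(self, predicate):
        """Filter on the board so that only matching rows go to the host."""
        return [r for r in self.rows.values() if predicate(r)]


class DbFs:
    """Hypothetical dbfs layer spreading rows across several boards."""

    def __init__(self, n_boards):
        self.boards = [Controller() for _ in range(n_boards)]

    def _board_for(self, key):
        # Assumed partitioning rule: integer key modulo the board count.
        return self.boards[key % len(self.boards)]

    def insert(self, row):
        self._board_for(row["key"]).insert(row)

    def lookup(self, key):
        return self._board_for(key).lookup(key)

    def scan(self, predicate):
        # One worker per board; each board filters its own partition.
        with ThreadPoolExecutor(max_workers=len(self.boards)) as pool:
            parts = list(pool.map(lambda b: b.scan(predicate), self.boards))
        return [row for part in parts for row in part]


fs = DbFs(n_boards=4)  # e.g. 4 controller boards on a VME system
for i in range(100):
    fs.insert({"key": i, "qty": 2 * i})

hit = fs.lookup(42)
big = sorted(r["key"] for r in fs.scan(lambda r: r["qty"] >= 190))
```

A keyed lookup touches a single board, while a scan fans out to all of
them and returns only the rows that pass the filter.  Locking and
compression are left out here; where they should live is exactly the
open question.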

Comments?

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)604-6117