Path: utzoo!attcan!uunet!tellab5!segel
From: segel@tellabs.com (Mike Segel)
Newsgroups: comp.databases
Subject: Re: Relational Database, with a Graphical type field
Message-ID: <2952@tellab5.tellabs.com>
Date: 8 Jul 90 17:26:39 GMT
References: <6207@tekgen.BV.TEK.COM> <2895@tellab5.tellabs.com> <913@dgis.dtic.dla.mil>
Sender: news@Tellabs.COM
Organization: Tellabs, Inc. Lisle IL
Lines: 105


	[As a point of clarification, I am talking about DB internal mechanisms
	 for handling Blobs. Anyone confused by this discussion should sit back
	 and think for a while. Please do not use my discussion as an opinion on
	 the performance of any database or database company. HAPPY DAVE? ;-]

In article <913@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>segel@tellabs.com (Mike Segel) writes:
>
>>Well, the problem arises when you now have each tuple being 1 - 2 Meg
>>in size. So now how do you efficiently sort on non graphic data?
>
>By using appropriate storage structures, access methods, sort
>algorithms, query decomposition (might reduce rows to be sorted based
>on other query terms, even eliminate the need for sort if 0 or 1
>returned row).
>
	Jon, you are missing the point. By keeping the Blob as part of the tuple,
	you now have a tuple about 2 Meg wide (plus the rest of the tuple), so
	each of your records is now 2 Meg wide. How many tuples of this size
	can you sort, query, etc., and how efficiently? How much memory is
	now required? (How much swap?) Let's say you build up a cursor and it
	contains 3 rows. That could be around 6 Meg of memory/swap. Not too
	efficient, since the 2 Meg of Blob data in each row is not required
	for any join or relational function. (Yet...)

	On top of that, not every tuple will have a blob attached, yet each
	one will still have to have space allocated for a blob.

	All of these problems are reduced when you just have a pointer as part of
	the tuple. The pointer can point to the Blob storage area of a raw disk
	(Informix Online), or to a file or directory in Unix. 
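
	A rough sketch of the pointer idea, in Python purely for illustration.
	This is not Informix code: the names blob_store, blob_ref, sort_rows and
	fetch_blob are made up, and a dictionary stands in for the Blob storage
	area. It only shows that a sort on non-graphic columns never has to
	touch the 2 Meg of image data, and that a row without a blob costs
	nothing extra.

```python
BLOB_SIZE = 2 * 1024 * 1024  # 2 Meg, the blob size used in this discussion

# Hypothetical blob storage area (blob id -> bytes). It stands in for a
# raw disk area or a directory of files; it is not a real Informix structure.
blob_store = {
    101: bytes(BLOB_SIZE),
    103: bytes(BLOB_SIZE),
}

# Each tuple carries the ordinary columns plus a small blob reference.
rows = [
    {"id": 3, "address": "12 Oak St", "blob_ref": 103},
    {"id": 1, "address": "7 Elm Ave", "blob_ref": 101},
    {"id": 2, "address": "90 Main St", "blob_ref": None},  # no blob attached
]


def sort_rows(rows, key):
    """Sort on a non-graphic column; no blob bytes are read or copied."""
    return sorted(rows, key=lambda r: r[key])


def fetch_blob(row):
    """Follow the reference only when the image is actually wanted."""
    ref = row["blob_ref"]
    return blob_store[ref] if ref is not None else None


ordered = sort_rows(rows, "id")

# For comparison: 3 rows with the blob stored inline would need this many
# bytes of memory/swap for the blob data alone.
inline_bytes = len(rows) * BLOB_SIZE
```

	Here the 3-row cursor holds three small records, and fetch_blob() is
	called only for the rows whose image is actually displayed.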

	At the end of last year, Silicon Graphics announced a database product
	which allowed for the storage of Blobs based on Informix's Standard Engine.
	How could they do this? Having not seen the package, or source code, 
	I can only conjecture on the following....

		The front end, I believe, was in X or some other windowing environment,
		so it could have been written in ESQL/C. The blobs could be stored
		either as a single file or as a directory of multiple files. (I would
		think that multiple files is better.) Then the only question would be:
		how do they perform locking? This is fairly straightforward and has
		been published in several books. The tuple then need only contain an
		FD to the graphic blob. Of course there are some other potential
		problems which are irrelevant to this discussion.
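
		To make the conjecture concrete, here is a minimal sketch of the
		"directory of multiple files" layout, again in Python. It is only a
		guess at the general shape, not how the SG product works. The names
		write_blob, read_blob and blob_path are hypothetical, the tuple
		stores a file path where the text above says "FD", and the locking
		uses the advisory flock() call, which is Unix-only.

```python
import fcntl
import os
import tempfile


def write_blob(blob_dir, name, data):
    """Store one blob as its own file; return the path kept in the tuple."""
    path = os.path.join(blob_dir, name)
    # Open without truncating, so nothing is destroyed before the lock is held.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # exclusive (advisory) lock for writers
        f.truncate(0)
        f.write(data)
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)
    return path


def read_blob(path):
    """Read a blob back under a shared lock, so readers do not block readers."""
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)
        data = f.read()
        fcntl.flock(f, fcntl.LOCK_UN)
    return data


# Hypothetical usage: the database row holds only the path to the blob file.
blob_dir = tempfile.mkdtemp()
row = {"id": 1, "blob_path": write_blob(blob_dir, "house_1.img", b"\x00\x01\x02")}
image = read_blob(row["blob_path"])
```

		One file per blob keeps the lock granularity at a single image; a
		single large file would need byte-range locks instead.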

	The point is, they (SG) and Informix are providing the capability of ADT's
	by allowing for Blobs. As I recall, this discussion evolved from
	trying to allow for ADT's like graphical images, sounds, text, or various
	other fields. This can all be accomplished within the relational model.

>
>Neither true nor false.  Which usually indicates you aren't honing
>in on the right issues.  "ADT's don't kill performance, people
>who don't know how to use them kill performance"  :-)
>
	Yeah. It's like tuning the back-end to gain performance when a series
	of code reviews and a rethink of the specs would do more good ;-)

>>I prefer to take the minimalist approach in designing
>>back-ends. The less intelligent the backend, the greater ability 
>>to treat data in an abstract fashion.
>
>What would be an example of this?  To weigh against all the
>counter-examples.
>
	Simple. Take the idea of a blob. In Informix, it is a byte stream
	of up to 2 Meg. With this one simple type, you can allow
	a database to contain images, voice/sound, or any other data which
	is, in its simplest form, digital information.
		
	Now Informix also allows for a varchar and, I believe, another type
	of stream. (I need to go back and check, so don't flame me if I am wrong.)
	What they have done is to have the back end define the basic building
	blocks which will allow for other ADT's.

		This is great for certain applications. One example is the real estate
	demo Informix uses for Online. They have this demo done twice: once using
	Sunview on a Sun workstation with a CD-ROM device, and once on a Mac
	running Wingz. (Another fine product from Informix ;-) Now, both show you
	a raster of the house, and different views. How is it stored? How can you
	take a raster/GIF picture which is required for two or three different
	machines, and store it in the DB? You could create an ADT for each
	raster/image format, but that means storing the photograph in the DB
	several times. Or you could separate the header information from the blob,
	then have the front-end application, based on the machine, reassemble the
	image in the correct format from the blob and the header information.
	
	So now your front-end application needs to be a little smarter, yet your
	back-end is capable of supporting various front ends without having to 
	be modified. My point is that the DB backend should be storing the data 
	in its simplest components rather than trying to handle data in its more
	complex forms.
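
	A small sketch of the header/blob split, in Python for illustration
	only. The column names and the two functions are made up, and the two
	Netpbm PGM variants (binary "P5" and plain-text "P2") simply stand in
	for whatever image formats the Sun and Mac front ends really need. The
	back end stores one copy of the pixels plus a few ordinary header
	columns; each front end rebuilds the format it wants.

```python
# One stored row: header fields are ordinary columns, the pixels are the blob.
# A tiny 3x2 grayscale image is used so the example stays readable.
row = {
    "width": 3,
    "height": 2,
    "maxval": 255,
    "pixels": bytes([0, 128, 255, 255, 128, 0]),
}


def to_binary_pgm(row):
    """Front end A: rebuild the image as binary PGM (Netpbm 'P5')."""
    header = "P5\n%d %d\n%d\n" % (row["width"], row["height"], row["maxval"])
    return header.encode("ascii") + row["pixels"]


def to_ascii_pgm(row):
    """Front end B: rebuild the same pixels as plain-text PGM (Netpbm 'P2')."""
    w = row["width"]
    px = row["pixels"]
    lines = ["P2", "%d %d" % (w, row["height"]), str(row["maxval"])]
    for y in range(row["height"]):
        # One text line per image row, values separated by spaces.
        lines.append(" ".join(str(v) for v in px[y * w:(y + 1) * w]))
    return "\n".join(lines) + "\n"


binary_image = to_binary_pgm(row)
text_image = to_ascii_pgm(row)
```

	Adding a third front end means writing one more small function like
	these; the stored row and the back end stay unchanged.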

>-- Jon

- Mike (" I am no expert. No one pays me for my opinions" ;-)

--
Mike Segel         | uunet!balr.com                    | Std.disclaimer 
BALR Corporation   | segel@quanta.eng.ohio-state.edu   | implied and 
Oakbrook, Illinois | uunet!tellabs.com!segel           | understood
-------------------^-----------------------------------^----------------