Path: utzoo!attcan!uunet!tellab5!segel
From: segel@tellabs.com (Mike Segel)
Newsgroups: comp.databases
Subject: Re: Relational Database, with a Graphical type field
Message-ID: <2952@tellab5.tellabs.com>
Date: 8 Jul 90 17:26:39 GMT
References: <6207@tekgen.BV.TEK.COM> <2895@tellab5.tellabs.com> <913@dgis.dtic.dla.mil>
Sender: news@Tellabs.COM
Organization: Tellabs, Inc. Lisle IL
Lines: 105

[As a point of clarification, I am talking about DB internal mechanisms
for handling Blobs. Anyone confused by this discussion should sit back
and think for a while. Please do not use my discussion as an opinion on
the performance of any database or database company. HAPPY DAVE? ;-]

In article <913@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>segel@tellabs.com (Mike Segel) writes:
>
>>Well, the problem arises when you now have each tuple being 1 - 2 Meg
>>in size. So now how do you efficiently sort on non graphic data?
>
>By using appropriate storage structures, access methods, sort
>algorithms, query decomposition (might reduce rows to be sorted based
>on other query terms, even eliminate the need for sort if 0 or 1
>returned row).
>

Jon, you are missing the point. By keeping the Blob as part of the
tuple, you now have a tuple that is 2 Meg (give or take the rest of the
tuple) in width, so each of your records is now 2 Meg wide. How many
tuples of this size can you sort, query, etc., and how efficiently? How
much memory is now required? (How much swap?) Let's say you build up a
cursor and it contains 3 rows. That could be around 6 Meg of
memory/swap. Not too efficient, since the 2 Meg of Blob data in each row
is not required for any join or relational function. (Yet...) On top of
that, not all tuples will have a Blob attached, yet each will have to
have space allocated for one.

All of these problems are reduced when you just have a pointer as part
of the tuple. The pointer can point to the Blob storage area of a raw
disk (Informix Online), or to a file or directory in Unix.
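
To put rough numbers on the 3-row cursor, here is a minimal sketch in
Python. It is only a byte-counting model, not how Informix (or any other
engine) actually lays out tuples. The row classes, the 8-byte key width,
and the /blobs/ file names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

BLOB_SIZE = 2 * 1024 * 1024   # 2 Meg Blob, as in the discussion above
KEY_SIZE = 8                  # assumed width of an integer key column


@dataclass
class InlineRow:
    """Hypothetical tuple that carries its Blob in-line."""
    key: int
    name: str
    blob: bytes               # a full-width slot, even for "no picture"

    def width(self) -> int:
        return KEY_SIZE + len(self.name) + len(self.blob)


@dataclass
class PointerRow:
    """Hypothetical tuple that only points at Blob storage."""
    key: int
    name: str
    blob_ref: Optional[str]   # e.g. a Unix file name; None means no Blob

    def width(self) -> int:
        return KEY_SIZE + len(self.name) + len(self.blob_ref or "")


def cursor_bytes(rows) -> int:
    """Bytes a cursor must hold to keep every row in memory/swap."""
    return sum(row.width() for row in rows)


def make_cursors():
    """Build the same 3-row cursor in both (assumed) layouts."""
    names = ["house-c", "house-a", "house-b"]
    inline = [InlineRow(k, n, bytes(BLOB_SIZE)) for k, n in enumerate(names)]
    pointer = [PointerRow(k, n, f"/blobs/{k}.img") for k, n in enumerate(names)]
    pointer[2].blob_ref = None  # a row without a Blob costs nothing extra
    return inline, pointer


if __name__ == "__main__":
    inline, pointer = make_cursors()
    # Sorting on a non-graphic column gives the same order either way...
    print([r.name for r in sorted(inline, key=lambda r: r.name)])
    print([r.name for r in sorted(pointer, key=lambda r: r.name)])
    # ...but the modelled working set is very different.
    print(cursor_bytes(inline), "bytes in-line")
    print(cursor_bytes(pointer), "bytes with pointers")
```

Under these assumptions the in-line cursor comes to a little over 6 Meg,
while the pointer version stays well under 1K, and the sort on the
non-graphic column never has to touch the Blob data at all.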

At the end of last year, Silicon Graphics announced a database product
which allowed for the storage of Blobs based on Informix's Standard
Engine. How could they do this? Having not seen the package, or source
code, I can only conjecture on the following....

The front end, I believe, was in X or some other windowing environment,
so it could have been written in ESQL/C. As for storage, they could
store all the Blobs in a single file, or in a directory of multiple
files. (I would think that multiple files is better.) Then the only
question would be: how do they perform locking? This is fairly
straightforward and has been published in several books. Then the tuple
need only contain an FD to the graphic Blob. Of course there are some
other potential problems which are irrelevant to this discussion.

The point is, they (SG) and Informix are providing the ability of ADT's
by allowing for Blobs. As I recall, this discussion evolved from trying
to allow for ADT's like graphical images, sounds, text, or various other
fields. This can all be accomplished within the relational model.

>
>Neither true nor false. Which usually indicates you aren't honing
>in on the right issues. "ADT's don't kill performance, people
>who don't know how to use them kill performance" :-)
>

Yeah. It's like tuning the back-end to gain performance when a series of
code reviews and a rethink of the specs would do more good ;-)

>>I prefer to take the minimalist approach in designing
>>back-ends. The less intelligent the backend, the greater ability
>>to treat data in an abstract fashion.
>
>What would be an example of this? To weigh against all the
>counter-examples.
>

Simple. Take the idea of a Blob. In Informix, it is a byte stream of up
to 2 Meg. With this simple type, you can allow a database to contain
images, voice/sound, or any other data which is, in its simplest form,
digital information. Informix also allows for a varchar and, I believe,
another type of stream.
(I need to go back and check, so don't flame me if I am wrong.) What
they have done is to have the back end define the basic building blocks
which will allow for other ADT's.

This is great for certain applications. One example is the real estate
demo Informix uses for Online. They have done this demo twice: once
using Sunview on a Sun workstation with a CD-ROM device, and once on a
Mac running Wingz. (Another fine product from Informix ;-) Now, both
show you a raster of the house, and different views. How is it stored?
How can you take a raster/GIF picture which is required for two or three
different machines, and store it in the DB?

You could create an ADT for each raster/image format, but that means
storing the photograph in the DB several times. Or you could separate
the header information from the Blob, then have the front-end
application, based on the machine, reassemble the image and the header
information in the correct format. So now your front-end application
needs to be a little smarter, yet your back-end is capable of supporting
various front ends without having to be modified.

My point is that the DB back-end should be storing the data in its
simplest components rather than trying to handle data in its more
complex forms.

>-- Jon

- Mike
(" I am no expert. No one pays me for my opinions" ;-)

--
Mike Segel         | uunet!balr.com                    | Std.disclaimer
BALR Corporation   | segel@quanta.eng.ohio-state.edu   | implied and
Oakbrook, Illinois | uunet!tellabs.com!segel           | understood
-------------------^-----------------------------------^----------------