Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!amdahl!pacbell!hoptoad!unisoft!mtxinu!sybase!linus!tim
From: tim@linus.sybase.com (Tim Wood)
Newsgroups: comp.databases
Subject: Re: Databases: separate-file vs. monolithic file structure
Message-ID: <976@sybase.sybase.com>
Date: 10 Sep 88 00:16:17 GMT
References: <6299@dasys1.UUCP>
Sender: news@sybase.sybase.com
Reply-To: tim@linus.UUCP (Tim Wood)
Organization: Sybase, Inc.
Lines: 151

In article <6299@dasys1.UUCP> alexis@dasys1.UUCP (Alexis Rosen) writes:
>
>Recently, Tim Wood wrote a response to my article on database file structures
>...
>He missed ... that I was really addressing smaller
>machines. At any rate, I specifically excluded DBMSs which did raw I/O from my
>analysis.

I don't believe the PC focus was clear.  The article made rather
sweeping statements about different storage structures in general, but
the conclusions did not generalize beyond the single-user environment,
and maybe not within that.

As for raw I/O, if your DBMS does not take advantage of a mechanism in
the environment that offers potentially higher throughput, then it is
not the DBMS to use in your environment.  It's akin to saying "I'm
going to compare vendor X's bubble-sort vs. vendor Y's selection sort routines 
but ignore the quicksort from vendor W."

>
>BTW, the raw I/O stuff looks like distributed structure to me. 

No.  A single raw disk partition (of a certain minimum size) can hold
the entire set of databases and tables the DBMS sees.  This is likely
to give suboptimal performance (because no interleaving) and impose 
space constraints (i.e., need to grow beyond confines of the partition.)   
It's the most restrictive case of a scalable system (1 CPU, 1 disk).
But a flexible mono system will let you scale upward and choose your physical
layout.

>In fact, the 
>real breakdown, I guess, is between products that layer two levels of file 
>access on top of each other vs. the ones that only have one layer (i.e. they 
>either use the native file system for each of their logical files, or do raw 
>I/O and use their own file system).

That's a lot closer to the real issue.

>>>...Of course there are
>>>important logistics to consider, such as how to lock out access to data which
>>>is temporarily invalid because it is being updated privately by that node,
>>>but often this is not a problem at all. Even when it is, it's better that not
>>>being able to do the job at all. ]
>>
>>Unless you don't [like] inconsistent transactions and corrupt data.
>
>... I guess I was very unclear here. 
>What I was saying was that in many cases, with canned applications, you have 
>knowledge that certain tables will never be written to except by one user 
>(exceept on weekends, on Feb. 29, and during a Solar eclipse. Whatever...)
>In this case, you need not establish a lock and can just do whateer you like 
>without worries. This is what I meant by it being 'not a problem'.

The only problem is trivial applications.  Inability to support general
concurrency with consistency severely limits the ability of your system to
model the real-world problems you are trying to solve.   If you know
that your business is going to continue to operate from your garage 
with 3 employees for the forseeable, you might go with manual control.
If your workload is going to grow, though, that will quickly become
untenable.  It's much better to adopt a transaction-oriented DB design
from the start.  Then your applications can evolve at the level of 
your business problem rather than "well, I guess it's time to find a DBMS
that supports locking and revamp my applications for it".  

Moreover, with an active data dictionary, there will be concurrent
updates to the master object directory (e.g. "sysobjects"), even if no
user ever accesses an object in use by another, because the data
dictionary must maintain changing information about all the objects.

>
>>As for the disk strategies, they're fine.  I should mention that a
>>mono organization should not prohibit you from using other disk
>>devices (or files).  [etc.]
>
>Well, it's easy enough to do disk striping, but I'm not convinced that the DBMS
>is always going to be smarter than I am. What about prioritization? I want 
>certain users of table A, who don't need indices I, J, K, and L, to have very 
>fast response time. Other users of A, who need I J K and L, can go hang because
>they aren't upper management. :-) In this case I want to put A on one spindle 
>and everything else on another. I am not saying that it's impossible to write a
>DBMS which can handle things like this. I am saying that that would involve a 
>fair amount of AI, and I haven't seen anything like it yet.

That's not AI, that's database design.
The DBMS needs to offer the facilities for the DBA to specify physical
storage usage at various levels.  Good DB design and layout are powerful means
of obtaining performance.

Your example is puzzling, since indexes are supposed to make table access
faster.  You want the people who are hammering on A to use whichever of
I, J, K & L will give them the fastest response.  To do much besides 
adding a row at the end of A or truncating it, you should use the
indexes.  The parititioning you suggest is a worthwhile one.

>As I wrote at the beginning of this article, raw I/O fits my 'distributed'
>model more closely than it does the 'monolithic' model. 

Um, well, you're taking an argument against your conclusion and trying
to make it one in favor.  The raw partition is used because one desires
to use only the DBMS storage management structures, not those of the
OS.  That fits the mono definition much more closely than the separate-
file definition.

>Excluding this, though,
>does anyone think that mono structures have any big advantages?

Hard to say, since you've qualified the monolithic idea so as to make it
nearly meaningless.

>
>Anyway I am glad that you have taken the time to write the perfect DBMS 
>back-end. Now all you need to do is sell it for Macs and PCs (not under unix, 
>either) and I'll be very happy.

Oh, I see.  UNIX (& OS/2?) non grata.
You are very attached then to underpowered, facility-poor "OS"s like 
MS-DOS and Mac?   I suggest you stick to toy applications to match those
environments, then.

>Whew. Now, let me ask a question without making any sweeping statements:

For a change.

>One point we all missed is the possibility of combining multiple tightly- 
>related tables on the same physical portion of the disk. 

This is known generically as "clustering".  I believe Oracle does it.

>The speed advantage 
>here might be worth a mono structure (but I would guess not). Some SQLs allow 
>you to create "clusters". Now, what exactly is the DBMS writing to the disk? 

One idea is that you try to place pages for joining tables contiguously on
the same disk track.  Then, when a join query comes along, you can suck 
the first set of (potentially) joining rows into memory with one disk
read.  Then your join becomes an in-memory operation.   This is only
really effective if you are joining on the primary (i.e. sort) keys
of the tables, because that is the basis for the clustering.  It
is not very amenable to schema changes because it reduces independence
of the data from the storage structure.  

>The only way I can think of for doing this is to actually store a join of the 
>two files. This could chew up an enourmous amount of disk space. What is it 
>doing?

Fortunately, it's not done that way.  I suggest that you read some 
textbooks on databases, such as C.J. Date's _Intro to Database Systems_,
2nd. Ed. before making many more declarative postings.

-TW
{ihnp4!pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
..not an @ in the bunch...