Xref: utzoo comp.sys.mac:19961 comp.databases:1344 comp.sys.mac.programmer:2297
Path: utzoo!utgpu!attcan!uunet!pyrdc!pyrnj!dasys1!alexis
From: alexis@dasys1.UUCP (Alexis Rosen)
Newsgroups: comp.sys.mac,comp.databases,comp.sys.mac.programmer
Subject: Re: Databases: distributed vs. monolithic file structure
Message-ID: <6213@dasys1.UUCP>
Date: 2 Sep 88 11:18:34 GMT
Lines: 231


Recently, Jonathan Krueger (jkrueger@daitc.daitc.mil) wrote:
> [lots of stuff which I'll get to in a second]

I guess I made a mistake. When I originally wrote this article I was
specifically looking at the Macintosh and DBMSs available for it. When I was
done I saw that what I wrote was equally applicable to PCs. When I posted this
article to comp.sys.databases, I forgot that nowhere in my article did I
specifically say that I was focussing mainly on micros. So Jonathan is
absolutely correct about some things, in a large-systems environment. I feel
that what I originally wrote stands, in a Microcomputer context.

He does make, however, many statements which I think are wrong, regardless
of context. Here goes:

>In article <6178@dasys1.UUCP> alexis@dasys1.UUCP (Alexis Rosen) writes:
>>If your DBMS has to go through its own file-management code as well as
>>the OS's, it will always be slower than if it only needed to go
>>through the OS.
>
>This is incorrect.  The cost of operations like (create, destroy,
>find, append, delete, replace) is a function of the data structure
>used, not where the code resides.  If you implement a relational
>database by putting each table in a file, you use the data structure
>structure defined by the filesystem.  If you put all tables into one
>large file, you define your own data structure.  Either way, the data
>structure will optimize for certain operations and against others.
> [etc]
>		One Large File		One File per Table
>		==============		==================
>find table	seek through file	open file
>find row	seek through file	seek through file
>update row	write location		write location

It is true that you are always working within the context of _somebody's_ data
structure. However, it is likely that your file system is smarter than your
database (on a micro, at least). Still, this is a weak point.

However, as far as finding and updating rows, I think that the table above
cheats a lot. For example, to retrieve a row: Given that you've computed the
absolute offset from the beginning of the file to the row you want, the
distributed structure only requires the file system to seek to that offset.
Very simple.

The monolithic structure creates a huge headache by comparison. Remember, it's
not so easy as saying "seek to (row offset + table offset from beginning of
file)". You can't simply store multiple tables one after the other. You need to
be able to mix blocks of one table with blocks of another. You need pointers to
track the blocks of each file. You need master file pointers to track the file
block pointers... Or you need a similar, equally convoluted scheme. So
initially accessing a row (or even the index's pointer to that row) takes a lot
more work. Similarly, linear searches through the file (which should be avoided
like the plauge, usually) are much worse. Obviously, you can cache pointers to
all these things, but there is always a limit on how much you can cache, and
caching that stuff means that you are caching less important stuff: your data.

In fact, using a monolithic structure implies that you are creating your own
special-purpose file system that sits on top of the regular file system.
Therefore, I think that Jon is wrong in all cases, EXCEPT in the case I noted
before: DBMSs that bypass the file system entirely. Then they are on a fairly
equal basis.

>>There is a much bigger performance gain for distributed structures in
>>a multiple-machine or multiple-hard-disk environment.  There is a very
>>large file with several indices associated with it. Many people use
>>this file, and some of those use the indices. Even with a very fast
>>disk, access is slow, and the file is too large for a cache to provide
>>significant help. The solution is to put the indices on a separate
>>hard disk (same server).  This results in immense speed improvements.
>
>The assumption seems to be that you can't do this with a single big
>file.  People do this all the time: it's called disk striping.
>Breaking data into smaller files has little to do with spreading data
>across multiple spindles.

True, for large systems. Not true, generally, for micros. (There are some
special drivers for MS-DOS that do this, for some hardware. They are not
widespread, to the best of my knowledge).

Even given that it is available, how can you think that disk striping (which is
not aware of the structure of your single large data file) can be more
efficient that a human being intelligently deciding how to allocate storage? As
an example, with the distributed structure I can decide that indices A, B, and
C should sit on a separate disk. With disk striping and a monolithic file, I'll
get random pieces of every table and index in my database on that disk.

>>If you have a group of files which are generally used for look-up
>>information, and usually not written to, you can put them all, along
>>with their indices, on a RAMdisk.
>
>Or you can use your RAM and your time more profitably: disk caching,
>virtual memory, code profiling, and dare we say it, database design.

Code profiling (and related analysis, such as data-flow modeling) should be
independent of this. Likewise for virtual memory.

Still, I see that we have a fundamentally different view of how smart DBMSs are
(and are likely to get in the near future). Caches definitely have their place.
The best solution in these instances would be the ability to tell your DMBS
"cache this entire file". Unfortunately, I don't know of any DBMS that can do
this. Barring that, I am much smarter that any cache algorithm I've met
recently, so I just stuff critical look-ups in the RAM disk.

In fact, I see this kind of decision as part of database design. More on that
later...

>>There are many nodes on a LAN all accessing a database, doing
>>fundamentally different jobs with the same data. If all of the files
>>(tables) in the database are left on the server, you have an
>>especially bad bottleneck problem if any node other than the server
>>wants to do some large-scale data manipulation. In particular, on
>>slower [paths] With a monolithic database you are S.O.L.
>
>This is incorrect.  Again, you can distribute pieces of a file to
>multiple nodes just as you can spread it across multiple disks.

I guess this depends on what file system you are using. Still, what I said
before stands: I (and you) can better determine where to allocate data than the
file system can.

>>but with a distributed structure you can just send the relevant fields
>>(columns) of the relevant records (rows) to a hard disk local to the
>>node which needs to manipulate the data. When the task is finished,
>>the data is reloaded into the main file(s).
>
>If you can't update tables without regard to physical location, it's
>not a distributed structure.  If you trade transparent access for
>application-specific speed, you're behind on the deal.  For instance:
>
>>Of course there are important logistics to consider, such as how to
>>lock out access to data which is temporarily invalid because it is
>>being updated privately by that node
>
>If you have to do your own record locking, hardcode data location
>into applications, and retune performance for every new disk, why use
>a database manager at all?

Because it's better than writing my code in C. Seriously, I think Jon has a
somewhat myopic view here. He seems to be working with the "true relational"
model, but that's not always practical. Location independence is wonderful in
theory, but I may need to sacrifice it for performance's sake without wanting
to lose all the other conveniences of a database system. That's why I'm willing
to worry about hand-locking those records, if necessary (and it's not always
needed).

In general, I am describing techniques to gain major performance improvements.
They aren't free. Some extra coding is going to be necesary. That's life. No
DMBS I know of is smart enough to even begin doing this intelligently.

>>but often [locking out rows by hand] is not a problem at all. Even when it
>>is, it's better than not being able to do the job at all.
>
>Wrong twice. It's better to get a slow answer than a wrong one.  And
>in no sense is one prevented from doing the job, adequate tools exist.

You misunderstand me. If it's a problem, then you need to write some code to
prevent the problem. That may not be necessary, if the columns updated in
private are never written to by other users. (This can often be assured.)

Adequate tools don't exist if the tools can't do the job in less time than the
cycle from one excecution of the job to the next, for example.

>>This technique...can lighten the load on your network considerably.
>
>This depends entirely on how well you predict which pieces of data
>will be needed where.  If you guess poorly, it will increase the load.
>Again, this has nothing to do with whether you partition your data by
>table or other unit.  For instance, you could keep one table per file
>and split each file among multiple nodes.

Clearly, I am predicting things well. If I can't, I wouldn't try to mess with
the location of the files. A good database engineer should, generally, be able
to make intelligent guesses along these lines, and careful study of a live
system should give even better information.

I don't see what this 'for instance' shows.

>>There is one other very important reason to use a distributed
>>structure that comes to mind. Any monolithic structure will impose
>>arbitrary restrictions on the number of data files or fields (tables
>>and columns) allowed in the database.
>
>This is incorrect.  Again, the data structure determines whether you
>can implement fixed or flexible field sizes, field or row width
>limits, restrictions on number of fields or tables.  Consider a
>monolithic tree.

Well, true. This is a case where large and small systems differ. All of the
monolithic-style systems on micros that I know of (I know of fairly many) have
arbitrary limits that can cramp one's style rather severely.

>>If you have a very large database, it may not fit on one physical
>>disk, and with the monolithic structure you are limited (generally) to
>>one device. With the distributed structure, these limitations just go
>>away.
>
>This is incorrect.  Several commercially available operating systems
>support disk striping, bound volume sets, and the like.  For those
>that don't, the limitations don't "just go away": what happens when a
>single table must grow larger than the disk?

Your OSs has some sort of limit on the logical size of volumes. Extend the
argument from 'physical disk' to 'logical volume' and it's pretty strong. At
worst, the distributed structure just pushes back the boundaries.

>No, the reason why it's convenient to put each table into a file is
>that we have a lot of tools that act on files.  It's good software
>engineering to use them on tables.  For instance, the directory
>listing program usually provides file size; in the monolithic
>structure, that functionality has to be provided elsewhere.  For
>another instance, the backup/restore utility knows how to restore
>files.  For the monolithic structure, that complicates its ability to
>recover from disasters.

Yes. This is a good reason for the distributed structure. There are many others
as well.

----

Let's turn this whole subject on its head. Are there any significant benefits
to a monolithic structure? I have mentioned the DBMSs that bypass the file
system, and I'm sure that doing that right is more efficient, but that's really
alot like a distributed system in disguise. Anything else?


----
Alexis Rosen                       {allegra,philabs,cmcl2}!phri\
Writing from                                {harpo,cmcl2}!cucard!dasys1!alexis
The Big Electric Cat                  {portal,well,sun}!hoptoad/
Public UNIX                         if mail fails: ...cmcl2!cucard!cunixc!abr1
Best path: uunet!dasys1!alexis