Path: utzoo!attcan!uunet!dasys1!alexis
From: alexis@dasys1.UUCP (Alexis Rosen)
Newsgroups: comp.databases
Subject: Re: Databases: separate-file vs. monolithic file structure
Message-ID: <6434@dasys1.UUCP>
Date: 13 Sep 88 09:15:39 GMT
References: <6299@dasys1.UUCP> <976@sybase.sybase.com>
Reply-To: alexis@dasys1.UUCP (Alexis Rosen)
Lines: 165

In article <976@sybase.sybase.com> tim@linus.sybase.com (Tim Wood) writes:
>In article <6299@dasys1.UUCP> alexis@dasys1.UUCP (Alexis Rosen) writes:
>>Recently, Tim Wood wrote a response to my article on database file structures
>>...He missed ... that I was really addressing smaller machines. At any rate,
>>I specifically excluded DBMSs which did raw I/O from my analysis.
>
>I don't believe the PC focus was clear. The article made rather
>sweeping statements about different storage structures in general, but
>the conclusions did not generalize beyond the single-user environment,
>and maybe not within that.

No. They don't generalize to OLTP environments on large systems. They DO
generalize to most multi-user DBMSs running on micros.

>As for raw I/O, if your DBMS does not take advantage of a mechanism in
>the environment that offers potentially higher throughput, then it is
>not the DBMS to use in your environment. [etc.]

That's not the situation for micro users. In general that environment doesn't
support raw I/O, which is why I excluded it.

>>BTW, the raw I/O stuff looks like distributed structure to me.
>No. A single raw disk partition [is a rotten idea...]

You implied the ability to use multiple spindles with raw I/O. Obviously,
limiting access to one partition will always be suboptimal.

>But a flexible mono system will let you scale upward and choose your physical
>layout.

Then it's not a mono system. It's a hybrid, which is probably better than a
pure form of either structure. (I should have mentioned this possibility
originally, but again, there are no major DBMSs for PCs yet which offer this
ability).

>>In fact, the
>>real breakdown, I guess, is between products that layer two levels of file
>>access on top of each other vs. the ones that only have one layer [etc.]
>
>That's a lot closer to the real issue.
>
>> [discussion about various strategies to minimize record & table locks]
> ... Inability to support general
>concurrency with consistency severely limits the ability of your system to
>model the real-world problems you are trying to solve. If you know
>that your business is going to continue to operate from your garage
>with 3 employees for the foreseeable, you might go with manual control.
>If your workload is going to grow, though, that will quickly become
>untenable. It's much better to adopt a transaction-oriented DB design
>from the start. [etc.]

You're missing the point here. I agree that OLTP is a great thing. I'd love to
be able to use it. But I can do things on a PC network without OLTP that
would take five times the money to do on a system powerful enough to use OLTP,
WITHOUT increasing the chances of blowing away important data. It won't take
enormous effort on my part, either.

Of course, in five years or so OLTP will be a lot easier (with '786 or '080
CPUs and 32 MB of RAM in the average micro). For now, though, even large
companies don't always have the extra money that OLTP demands to throw at a
problem. They want it done on a micro network, next month, with
industry-standard tools which hundreds of local programmers are already
familiar with.

>Moreover, with an active data dictionary, there will be concurrent
>updates to the master object directory (e.g. "sysobjects"), even if no
>user ever accesses an object in use by another, because the data
>dictionary must maintain changing information about all the objects.

So? Not too many micro DBMSs out there with any kind of data dictionary. That
sucks, I know, but I can still do useful work in them.

>>>As for the disk strategies, they're fine.
>>>I should mention that a
>>>mono organization should not prohibit you from using other disk
>>>devices (or files).

Then it's not a monolithic structure, is it?

>>Well, it's easy enough to do disk striping, but I'm not convinced that the
>>DBMS is always going to be smarter than I am. What about prioritization?
>>[example of prioritization/partitioning] I am not saying that it's impossible
>>to write a DBMS which can handle things like this. I am saying that that
>>would involve a fair amount of AI, and I haven't seen anything like it yet.
>
>That's not AI, that's database design.
>The DBMS needs to offer the facilities for the DBA to specify physical
>storage usage at various levels. Good DB design and layout are powerful means
>of obtaining performance. [etc.]
>The partitioning you suggest is a worthwhile one.

That's AI when the DBMS figures out the partitioning itself. Without that, the
DBA has to put various files in various different physical places, and that's
not a mono structure anymore...

>>As I wrote at the beginning of this article, raw I/O fits my 'distributed'
>>model more closely than it does the 'monolithic' model.
>
>Um, well, you're taking an argument against your conclusion and trying
>to make it one in favor. The raw partition is used because one desires
>to use only the DBMS storage management structures, not those of the
>OS. That fits the mono definition much more closely than the
>separate-file definition.

It's hardly an argument against my conclusion when I specifically excepted it.
Regardless, monolithic (by my definition) means 'one physical file' (this does
not prohibit disk striping). The DBMS you describe allows raw I/O on several
different volumes, which may live on different machines. They are separate
devices and separate physical files. It also allows the DBA to assign specific
logical files (tables) to specific physical files. This is the outstanding
characteristic of distributed structures.
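
The definitions in play can be made concrete with a rough sketch. This is
only one reading of the terms: the function name, the table-to-file mapping,
the file names, and the rule that a single shared file counts as monolithic
even with one table are all assumptions of the example, not features of any
product discussed here.

```python
def classify_layout(table_to_file):
    """Classify a layout from a {table name: physical file} mapping.

    Hypothetical rule of thumb (an assumption, not a standard definition):
      - every table in one physical file      -> "monolithic"
      - every table in its own physical file  -> "distributed"
      - several files, some of them shared    -> "hybrid"
    """
    if not table_to_file:
        raise ValueError("need at least one table")
    files = set(table_to_file.values())
    if len(files) == 1:
        return "monolithic"
    if len(files) == len(table_to_file):
        return "distributed"
    return "hybrid"


# Illustrative layouts; all table and file names are made up.
one_file = {"customers": "app.db", "orders": "app.db", "items": "app.db"}
file_per_table = {"customers": "cust.dbf", "orders": "ord.dbf", "items": "itm.dbf"}
dba_placed = {"customers": "vol1.dat", "orders": "vol2.dat", "items": "vol2.dat"}

for name, layout in [("one_file", one_file),
                     ("file_per_table", file_per_table),
                     ("dba_placed", dba_placed)]:
    print(name, "->", classify_layout(layout))
```

Under this reading, a DBMS that lets the DBA spread tables over several
volumes falls out of the "monolithic" bucket as soon as a second physical
file appears, whatever storage manager sits on top.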
In fact, you're describing a DBMS which is a hybrid. As I said before, the
hybrid is probably the best way to do things. If you disagree with my
definition, fine. I'll agree to disagree...

>>Excluding this, though,
>>does anyone think that mono structures have any big advantages?
>
>Hard to say, since you've qualified the monolithic idea so as to make it
>nearly meaningless.

I haven't. Actually there is one advantage which we both forgot (thanks to
Dennis Cohen who reminded me). Under certain OSs (especially micro OSs) opening
a file can take a great deal of time. There may also be serious limits on the
number of files you can keep open at any one time. These are both problems for
a distributed file structure. I don't believe they outweigh the advantages,
though, at least not with MS-DOS or Mac OS.

>>Anyway I am glad that you have taken the time to write the perfect DBMS
>>back-end. Now all you need to do is sell it for Macs and PCs (not under unix,
>>either) and I'll be very happy.
>
>Oh, I see. UNIX (& OS/2?) non grata.
>You are very attached then to underpowered, facility-poor "OS"s like
>MS-DOS and Mac? I suggest you stick to toy applications to match those
>environments, then.

I suggest you stick to what you know about (OLTP). Whatever their faults (and
they are legion), these OSs dominate the PC market, and I and many other
people have developed many non-trivial applications in these environments.

>>Whew. Now, let me ask a question without making any sweeping statements:
>
>For a change.
>...I suggest that you read some
>textbooks on databases, such as C.J. Date's _Intro to Database Systems_,
>2nd. Ed. before making many more declarative postings.

Don't be nasty. It doesn't advance your position. Should I suggest to you that
you read an introductory book on microcomputers?

All of the foregoing discusses whether or not various benefits I attributed to
distributed file DBMSs are applicable to mono DBMSs as well.
Except for what I wrote about file-opening overhead, I have yet to hear any
advantages the mono structure has. (This is _specifically_ a mono structure
managed by the OS). Are there any?

----
Alexis Rosen                       {allegra,philabs,cmcl2}!phri\
Writing from                                {harpo,cmcl2}!cucard!dasys1!alexis
The Big Electric Cat                  {portal,well,sun}!hoptoad/
Public UNIX                        if mail fails: ...cmcl2!cucard!cunixc!abr1
Best path: uunet!dasys1!alexis