Path: utzoo!attcan!uunet!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!aplcen!haven!adm!smoke!gwyn
From: gwyn@smoke.brl.mil (Doug Gwyn)
Newsgroups: comp.unix.questions
Subject: Re: unix file structure (or lack of same)
Keywords: unix, file, database
Message-ID: <14335@smoke.brl.mil>
Date: 5 Nov 90 14:41:49 GMT
References: <125379@linus.mitre.org>
Organization: U.S. Army Ballistic Research Laboratory, APG, MD.
Lines: 38

In article <125379@linus.mitre.org> duncant@mbunix.mitre.org (Thomson) writes:
>I understand that, on unix, the file system is designed so that a file always
>looks like a sequence of bytes, with no record structure at all.

To be more precise, the operating system itself does not impose any
record structure on disk files within the standard hierarchical file
system.  Some device types, for example magnetic tape or punched-card
reader, might have their own idea of what constitutes a "record"
(normally each such record would have a length specified by the UNIX
write() system call that provided its data, in the case of magnetic
tape, or a particular fixed length, for a card reader).  Also, the
terminal handler under typical operation collects input from a
terminal port up through a new-line and treats it in many respects
as a (variable-length) record, although in this case partial,
kernel-buffered reads are fully supported.

>If so, how does one implement an efficient database manager on unix in
>a standard, portable, way?  To be efficient, a database manager needs to
>have random access into files on a record-oriented basis.  It seems to me
>that fseek() wouldn't do the job.

For normal disk files, applications are responsible for maintaining
whatever structure they wish to use.  Clearly, lseek() is suitable for
getting directly to any known position within the file; if a fixed
record size is assumed, then the arithmetic for the byte offset is
trivial.  For variable-sized records, a variety of organizations are
possible.  (In fact, this is a big win for the UNIX approach.)  A
typical one uses a separate "index file" with fixed, small record size
that points into a large variable-sized record database file.  B-trees
and other structures are also commonly used.

>If unix doesn't provide a record-oriented view of files, then any database
>implementation would have to go below unix, and access the mass storage
>devices directly.

No, not at all, although a couple of database managers do support that
mode in order to bypass the kernel overhead for the block-buffered
inode-based file system.
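
To make the two file organizations described above concrete, here is a
rough sketch in C.  It is only an illustration under stated assumptions,
not code from any particular database manager: the names RECLEN,
put_fixed, get_fixed, struct idx_entry, append_var and fetch_var are all
hypothetical, and the index entries are written as raw native structs,
so the resulting index file would not be portable between machines.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define RECLEN 32   /* hypothetical fixed record size, in bytes */

/* Fixed-size records: record n lives at byte offset n * RECLEN. */
int put_fixed(int fd, long recno, const char *text)
{
    char buf[RECLEN];

    assert(recno >= 0);                 /* record numbers start at 0 */
    memset(buf, 0, sizeof buf);         /* NUL-pad the record */
    strncpy(buf, text, RECLEN - 1);     /* always leaves a terminator */
    if (lseek(fd, (off_t)recno * RECLEN, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, buf, RECLEN) == RECLEN ? 0 : -1;
}

/* buf must have room for RECLEN bytes. */
int get_fixed(int fd, long recno, char *buf)
{
    assert(recno >= 0);
    if (lseek(fd, (off_t)recno * RECLEN, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, RECLEN) == RECLEN ? 0 : -1;
}

/* Variable-size records: a small fixed-size index entry per record,
 * pointing into a separate data file (assumed layout, native format). */
struct idx_entry {
    off_t  offset;  /* where the record starts in the data file */
    size_t length;  /* how many bytes it occupies */
};

/* Append a record to the data file and its entry to the index file.
 * Returns the new record number, or -1 on error. */
long append_var(int datafd, int idxfd, const void *rec, size_t len)
{
    struct idx_entry e;
    off_t idxpos;

    memset(&e, 0, sizeof e);            /* avoid writing stray padding */
    e.offset = lseek(datafd, 0, SEEK_END);
    idxpos = lseek(idxfd, 0, SEEK_END);
    if (e.offset == (off_t)-1 || idxpos == (off_t)-1)
        return -1;
    e.length = len;
    if (write(datafd, rec, len) != (ssize_t)len)
        return -1;
    if (write(idxfd, &e, sizeof e) != (ssize_t)sizeof e)
        return -1;
    return (long)(idxpos / (off_t)sizeof e);
}

/* Look up record recno via the index, then read it from the data file.
 * Returns the record length, or -1 on error or if buf is too small. */
ssize_t fetch_var(int datafd, int idxfd, long recno, void *buf, size_t bufsize)
{
    struct idx_entry e;

    if (lseek(idxfd, (off_t)recno * (off_t)sizeof e, SEEK_SET) == (off_t)-1)
        return -1;
    if (read(idxfd, &e, sizeof e) != (ssize_t)sizeof e)
        return -1;                      /* no such index entry */
    if (e.length > bufsize)
        return -1;
    if (lseek(datafd, e.offset, SEEK_SET) == (off_t)-1)
        return -1;
    if (read(datafd, buf, e.length) != (ssize_t)e.length)
        return -1;
    return (ssize_t)e.length;
}
```

A caller would open the files with open(2) using O_RDWR|O_CREAT and pass
the descriptors in.  Note that the index file is itself just a
fixed-record file, so finding entry n is the same trivial offset
arithmetic as in put_fixed/get_fixed; only the data file needs the extra
level of indirection.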