Path: utzoo!attcan!uunet!dg!rec
From: rec@dg.dg.com (Robert Cousins)
Newsgroups: comp.unix.questions
Subject: Re: sparse files
Message-ID: <238@dg.dg.com>
Date: 11 Dec 89 17:02:29 GMT
References: <21581@adm.BRL.MIL> <235@dg.dg.com> <2700@auspex.auspex.com>
Reply-To: uunet!dg!rec (Robert Cousins)
Organization: Data General, Westboro, MA.
Lines: 78

In article <2700@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
(I wrote)
>>UNIX treats the "holes" as 0's when read. In fact, UNIX has only
>>minimal support for sparse files. Backing up sparse files often
>>involves copying large amounts of nulls. Once an area of a file is
>>written, it cannot be returned to its previous sparse state.

>Not in general, anyway. At least the first version of AIX for the RT PC
>claimed, in its documentation, that it had an "fclear()" call to punch
>holes in files; I think this may show up in future releases of other
>UNIXes as well.

It is unclear whether support for sparse files is necessary. My only
point is that at one time they were very popular amongst a particular
class of heavy DP applications. Today we have the technology to use
system resources more effectively. Don't forget that B-trees are
relatively recent inventions!

>>In arguments that UNIX is not suitable for DP applications, sparse
>>files usually come up if the conversation goes on long enough between
>>knowledgeable people.

>Umm, what other operating systems support sparse files *and* return a
>"there's a hole there" indication? For instance, are there any OSes
>with extent-based file systems (VMS, OS/360 and successors as I
>remember, IRIX with SGI's Extent File System) that support sparse files?

There are a number of OS's which support sparse files. An incomplete
list of them includes:

	TurboDOS (1.3 and later)
	S1 (all revs if my memory is correct)
	RM/COS
	IBM System 3 OS (I think, it's been 10 years)
	VM
	VMS
	CP/M (It's not really an OS but . . . . it is extent based)
	Any operating system which supports honest-and-for-true ISAMs

In fact, a number of OS's designed for COBOL or RPG support have these
features. Anyone care to add to the list?

It is true, however, that newer operating systems don't support sparse
files. Add-ons such as VTAM, however, do still support them.

One reason for the demise of sparse files is the lack of support for
the concept of records in the more popular operating systems (UNIX, DOS,
etc.). It is much more difficult to treat a file efficiently as a sparse
collection of bytes than as a collection of records. Several of the
above-mentioned operating systems were plagued with handling sparse
files in some form of system-imposed record scheme.

Often this system-imposed scheme did hide the "sparseness" from
programmers under certain circumstances. For example, I have been told
that VMS allows programs to sequentially read a sparse file and skip
over gaps in the file.

ISAM files were intrinsically sparse. ("ISAM" is a term which has
recently been corrupted to mean "Keyed indexed access system of some
form" instead of the traditional surface/track/sector indexing scheme.)

As an aside, TurboDOS used sparse files as the extension mechanism for
files. To extend a file, one would lock the region beyond the end of
the file, write to it (implicitly extending the file) and then release
the lock. Since file locks covered system-imposed quantities, it was
possible for a program to create a sparse file by accident. If one
program wanted to write 1k bytes but the lock quantity was set at 2k
bytes, it would have to lock the entire physical record (2k bytes).
Any program attempting to extend the file at the same time would then
skip beyond the locked region (over the second half of the 2k bytes)
and do the same thing. Effectively a sparse file was created in which
the file ended in 1k of written data, 1k of "nothing", and 1k of
written data.
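
For anyone who wants to see that 1k/1k/1k layout for themselves, here
is a rough sketch. It is NOT TurboDOS and involves no locking at all;
it assumes a present-day UNIX-like system with Python, and it only
imitates the end result by seeking over the middle 1k instead of
writing it. The names write_with_gap, read_regions and demo are made up
for this illustration. Whether the skipped 1k actually occupies disk
blocks depends on the file system; under UNIX semantics it reads back
as zeros either way.

```python
import os
import tempfile

BLOCK = 1024  # 1k, the size used in the example above


def write_with_gap(path):
    """Write 1k of data, skip 1k without writing it, then write 1k more."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"A" * BLOCK)        # first program's 1k of data
        os.lseek(fd, BLOCK, os.SEEK_CUR)  # the skipped half of the 2k record
        os.write(fd, b"B" * BLOCK)        # second program's 1k of data
    finally:
        os.close(fd)


def read_regions(path):
    """Return the three 1k regions of the file as (first, gap, last)."""
    with open(path, "rb") as f:
        data = f.read()
    return data[:BLOCK], data[BLOCK:2 * BLOCK], data[2 * BLOCK:]


def demo():
    """Build the file in a scratch directory and report what is in it."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "extend.dat")
        write_with_gap(path)
        size = os.path.getsize(path)
        first, gap, last = read_regions(path)
    return size, first, gap, last
```

Calling demo() gives back the file size and the three regions, so the
never-written middle region can be compared against a block of nulls.
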
Depending upon other circumstances, the sparse area could either be
reported as unwritten (returning sparse file status) or, in certain
obscure cases, appear to contain the previous contents of some physical
disk sectors. This made porting some business applications quite
difficult, since business applications tend to depend upon shared files
extended in real time. Properly written applications could use sparse
files to their own advantage without difficulty, however.

Robert Cousins
Dept. Mgr, Workstation Dev't.
Data General Corp.

Speaking for myself alone.