Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!bloom-beacon!bu-cs!madd
From: madd@bu-cs.BU.EDU (Jim Frost)
Newsgroups: comp.unix.questions
Subject: Re: Sparse Files ?
Message-ID: <30481@bu-cs.BU.EDU>
Date: 30 Apr 89 21:16:42 GMT
References: <19342@adm.BRL.MIL>
Reply-To: madd@bu-it.bu.edu (Jim Frost)
Followup-To: comp.unix.questions
Organization: Software Tool & Die
Lines: 43

In article <19342@adm.BRL.MIL> mark@ria-emh2.army.mil (Mark D. McKamey IM SA) writes:
|     What is the definition of a "Sparse file" in the UNIX world?

UNIX stores data in files by maintaining pointers to data blocks.  By
allocating only those blocks which have actually been written to, you
can create files which appear to be larger than they actually are.
These are usually created by lseek()ing past the end of the file and
then write()ing.
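
A minimal sketch of that lseek()/write() sequence, for illustration.
The function name make_sparse_file and its arguments are my own and
not from any library, and whether the skipped region really goes
unallocated depends on the filesystem:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a hole of `hole_bytes` bytes
   followed by a single data byte.  Returns 0 on success, -1 on error. */
int make_sparse_file(const char *path, off_t hole_bytes)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Seek past the (empty) start of the file without writing anything,
       then write one byte at the new offset. */
    if (lseek(fd, hole_bytes, SEEK_SET) == (off_t)-1 ||
        write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling make_sparse_file("demo.dat", 1048576) should leave a file
whose reported length is 1048577 bytes, though only the block holding
the final byte needs to exist on disk.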

When you create an empty file, the system allocates a file information
block (called an inode) which contains a small list of block pointers.
This list is initially blank.  When we write into the file, the system
gets data blocks and sets the appropriate block pointer to point to
the block.

When we just create a file we get something like this:

	ptr1 -> null
	ptr2 -> null
	ptr3 -> null

When we write to that file we get something like this:

	ptr1 -> block1
	ptr2 -> null
	ptr3 -> null

We deposit data into block1 until block1 is filled, then get another
block and set ptr2 to point to it.  If instead of just opening and
writing the file you open, seek into the file somewhere, and then
write, you can get something like:

	ptr1 -> null
	ptr2 -> block1
	ptr3 -> null
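
The pointer tables above can be modeled with a toy in-memory
structure.  This is only a sketch to show the idea, not how any real
kernel lays out an inode; the names toy_inode, toy_write_byte and
toy_read_byte, the 512-byte block size and the three-pointer limit are
all assumptions made for the example:

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_BLOCK_SIZE 512   /* assumed block size, illustration only */
#define TOY_NPTRS      3     /* three pointers, as in the diagrams */

struct toy_inode {
    unsigned char *ptr[TOY_NPTRS];  /* NULL means no block allocated */
    size_t size;                    /* apparent file size in bytes */
};

/* Write one byte at offset `off`, allocating the covering block only
   when it is first touched.  Returns 0 on success, -1 on error. */
int toy_write_byte(struct toy_inode *ino, size_t off, unsigned char c)
{
    size_t b = off / TOY_BLOCK_SIZE;
    if (b >= TOY_NPTRS)
        return -1;
    if (ino->ptr[b] == NULL) {
        ino->ptr[b] = calloc(1, TOY_BLOCK_SIZE);
        if (ino->ptr[b] == NULL)
            return -1;
    }
    ino->ptr[b][off % TOY_BLOCK_SIZE] = c;
    if (off + 1 > ino->size)
        ino->size = off + 1;
    return 0;
}

/* Read one byte at offset `off`.  A null block pointer reads back as
   zero; an offset past the end of the file returns -1. */
int toy_read_byte(const struct toy_inode *ino, size_t off)
{
    size_t b = off / TOY_BLOCK_SIZE;
    if (off >= ino->size)
        return -1;
    if (ino->ptr[b] == NULL)
        return 0;
    return ino->ptr[b][off % TOY_BLOCK_SIZE];
}
```

Starting from an inode with all pointers null and writing a byte at
offset TOY_BLOCK_SIZE leaves ptr[0] null and ptr[1] set, which is the
third diagram.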

To the user it looks like he has a two-block file which has one block
of zeros (the system returns zeros for reads into null blocks), but to
the system he has only a one-block file.  This difference can add up
to a considerable savings in some cases.  For the normal case, this
behavior affects nothing.

jim frost
madd@bu-it.bu.edu