Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!think.com!zaphod.mps.ohio-state.edu!mips!wrdis01!emory!gatech!prism!scott
From: scott@prism.gatech.EDU (Scott Holt)
Newsgroups: comp.unix.aix
Subject: Re: AIX 3.1 File system mystery
Keywords: AIX disk space file system RS/6000
Message-ID: <29391@hydra.gatech.EDU>
Date: 19 May 91 21:29:10 GMT
References: <509@nwnexus.WA.COM>
Organization: Georgia Institute of Technology
Lines: 35

In article <509@nwnexus.WA.COM> wjones@nwnexus.WA.COM (Warren Jones) writes:
>I've observed something very mysterious in our RS/6000 file system:
>"ls -l" shows a file of ~24 Mbytes, but "du" shows the directory
>using only ~17 Mbytes. Can anyone out there offer an explanation?
>The following script file tells the whole story:
> ....

The file may be "sparse". On some UNIX file system implementations,
if an entire block of the file contains only zeros, the corresponding
block pointer in the inode may be set to zero rather than to the
location of a disk block containing data. The idea is: why allocate
disk space for something you know contains only zeros? "ls -l" reports
the file's apparent length, while "du" counts only the blocks actually
allocated, which is why the two disagree. This is typical of database
files (esp. those that use mdbm) and of other applications which write
randomly to a file. It is also not a property unique to AIX.

A word of warning about such files: backup programs love them, and I
do mean this to be taken sarcastically. A typical backup program reads
the file sequentially, and a sequential read returns the zeros of an
unallocated block just as if they were stored on disk, so the program
cannot tell the difference. It is quite possible for a file to
"appear" much larger than even the total amount of space on your disk.
When such a file is backed up, it will take up its apparent size on
the backup media. Worse yet, when it is restored, the restore program
may not "sparsify" the file; that is, it will try to restore it to its
apparent size, and then you have real problems.

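For anyone who wants to see the effect first-hand, here is a minimal
sketch in Python. It is an illustration only, not anything taken from
the original script: the helper names (make_sparse_file, sizes,
sequential_read_length, demo) are made up for this example, and it
assumes a UNIX-like system whose file system supports holes. It creates
a file by seeking past the end and writing a single byte, then compares
the apparent size, the allocated size, and what a naive sequential
reader would see.

```python
import os
import tempfile


def make_sparse_file(path, apparent_size):
    """Create a file whose only written byte is its last one."""
    with open(path, "wb") as f:
        f.seek(apparent_size - 1)  # skip ahead without writing anything
        f.write(b"\0")             # one real write at the very end


def sizes(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # st_size is the length "ls -l" reports. Python documents st_blocks
    # as a count of 512-byte blocks; it is the figure "du" adds up.
    return st.st_size, st.st_blocks * 512


def sequential_read_length(path, chunk=1 << 16):
    """Count the bytes a naive sequential reader (a backup tool) sees."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total


def demo(apparent_size=24 * 1024 * 1024):
    """Build a sparse file in a temporary directory and measure it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.dat")
        make_sparse_file(path, apparent_size)
        apparent, allocated = sizes(path)
        read_back = sequential_read_length(path)
    return apparent, allocated, read_back


if __name__ == "__main__":
    apparent, allocated, read_back = demo()
    print("apparent size (ls -l):", apparent)
    print("allocated (du):       ", allocated)
    print("bytes read in order:  ", read_back)
```

On a file system that supports holes, the allocated figure should come
out far smaller than the apparent size, while the sequential read
always returns the full apparent size. That last number is what a
naive backup ends up writing to tape.
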
I don't know how AIX backup and restore deal with this (any comments
from IBM?), but most other backup schemes (such as tar and cpio) deal
with it in a naive manner. This, too, is not unique to IBM.

- Scott
-- 
This is my signature. There are many like it, but this one is mine.
Scott Holt                    Internet: scott@prism.gatech.edu
Georgia Tech                  UUCP: ..!gatech!prism!scott
Office of Information Technology, Technical Services