Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!usc!randvax!segue!jim
From: jim@segue.segue.com (Jim Balter)
Newsgroups: comp.unix.internals
Subject: Re: holes in files
Message-ID: <5315@segue.segue.com>
Date: 26 Dec 90 17:40:28 GMT
References: <6193:Dec618:43:4390@kramden.acf.nyu.edu> <1820@b15.INGR.COM> <2809@cirrusl.UUCP> <11749@alice.att.com>
Reply-To: jim@segue.segue.com (Jim Balter)
Organization: Segue Software, Inc. - Santa Monica, CA. +1-213-453-2161
Lines: 45

In article <11749@alice.att.com> andrew@alice.att.com (Andrew Hume) writes:
>In article <2809@cirrusl.UUCP>, dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>actually, from what this thread has
>uncovered, it might be safer to write non-zero data to avoid
>smart filesystems. what scares me more are hyperintelligent
>disk drives that have built in data compression and might be able
>to take 20 blocks of some values but not be able to overwrite them
>because of different compression rates.

Obviously, the worst thing you can do is write zeros. Write random data.
Better than using a random number generator on the fly is to precompute a
block of data that looks like noise (there are various statistical measures
for randomness (lack of signal)). While this isn't guaranteed to defeat all
compression schemes, it greatly reduces the likelihood of too few blocks
being allocated. When the odds of that happening are on a par with the odds
that a plane will crash through the roof and destroy the disk drive, you can
sleep better at night. Also, if you are writing critical real time
applications, your hardware and OS are significant parts of the system and
should be carefully specified so that they do not violate your requirements.
Some people seem to think, though, that it is better to have inefficient
disk drives or archivers to prevent breaking such programs as yours.

st_blocks is so that programs (e.g., du, ls) can determine actual disk usage.
It isn't for any other purpose; it is silly to try to imagine such purposes,
and it is foolish, if you can think of such a purpose, to implement it.
I'm sure that, if the designer had thought of it and had thought it
necessary, s/he would have added something like "st_blocks is not a
permanent attribute of a file; it may, for instance, change if a file is
archived or is treated by a disk compacter." to the documentation. Pretend
that this was said. Programs that read the disk directly are bypassing the
file's logical structure and have no right to make any kind of assumption
about the persistence of file attributes.

As a general principle, people and programs care about files with holes
turning into out-of-space conditions, but conversely they have no reason to
object if out-of-space conditions turn into files with holes.

Archivers that restore with holes are doing it right. They acknowledge that
disk space matters (welcome to the real world) and that the presence of
holes is invisible within UNIX file semantics except for st_blocks, which is
a report value and not a permanent or persistent attribute of a file
(welcome to conceptual clarity).
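
A minimal sketch of the two points above, in Python rather than anything
from the original thread. It fills a file from one precomputed noise block,
creates a second file with a hole, and reports logical size next to the
space st_blocks says is allocated. The names BLOCK_SIZE, preallocate,
make_holey and usage are hypothetical, and the 512-byte unit for st_blocks
is an assumption that holds on common UNIX systems but is not guaranteed
everywhere. Repeating one identical block may still be defeated by a
filesystem that deduplicates or compresses across blocks, so treat this as
an illustration, not a guarantee.

```python
import os
import tempfile

BLOCK_SIZE = 8192   # assumed write size, chosen only for illustration
STAT_UNIT = 512     # assumed unit of st_blocks (typical on UNIX systems)

# Precompute one noise-like block once, instead of generating data per write.
NOISE_BLOCK = os.urandom(BLOCK_SIZE)


def preallocate(path, nblocks):
    """Write nblocks copies of the precomputed noise block to path."""
    with open(path, "wb") as f:
        for _ in range(nblocks):
            f.write(NOISE_BLOCK)
        f.flush()
        os.fsync(f.fileno())  # push the data out so allocation is reported


def make_holey(path, size):
    """Create a file of logical length size by seeking past EOF and
    writing a single byte; the skipped range may be stored as a hole."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def usage(path):
    """Return (logical size, bytes reported as allocated) for path."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * STAT_UNIT


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        full = os.path.join(d, "full")
        holey = os.path.join(d, "holey")
        preallocate(full, 128)
        make_holey(holey, 128 * BLOCK_SIZE)
        for p in (full, holey):
            size, allocated = usage(p)
            print(os.path.basename(p), "size:", size, "allocated:", allocated)
```

Both files have the same logical size and read back without error; only the
allocated figure can differ, and how much it differs depends on the
filesystem. That is the sense in which st_blocks is a report value rather
than part of the file's contents.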