Path: utzoo!attcan!uunet!lll-winken!ames!ncar!noao!asuvax!nud!estinc!fnf
From: fnf@estinc.UUCP (Fred Fish)
Newsgroups: comp.unix.wizards
Subject: Re: GNU-tar vs dump(1)
Message-ID: <11@estinc.UUCP>
Date: 7 Jan 89 18:50:42 GMT
References: <17999@adm.BRL.MIL> <629@mks.UUCP>
Reply-To: fnf@estinc.UUCP (Fred Fish)
Organization: Enhanced Software Technologies, Inc.
Lines: 27

In article <629@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
>One of the potential problems with using tar or cpio for backups
>is that a sparse file (one with unallocated blocks)
>that uses little disk space will use more space in the backup.
> (example deleted)
>  24 -rw-r--r--  1 egisin   100000004 Jan  3 17:59 big
>This file uses 24K on the BSD filesystem, and about 100M in a tar backup.

The problem with archive space consumed can be eliminated by compressing
(LZW) sparse files during the archive process. This can be done totally
transparently to the user. The example 100Mb file compresses to about
200Kb.

During extraction, each block can be tested for the case of a block of
null bytes (after decompression), and seeks used to recreate the hole.
This test/seek is actually faster in practice than writing blocks of
null bytes. I believe this is also independently confirmed by someone
who posted their results of modifying "cp" to create sparse files.

BRU (Backup and Restore Utility) uses both of these techniques, so this
is not just speculation. A complete filesystem save and restore often
results in additional free space recovered from files that could be
sparse, but weren't.

-Fred
-- 
# Fred Fish, 1346 West 10th Place, Tempe, AZ 85281, USA
# asuvax!nud!fishpond!estinc!fnf (602) 921-1113
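
The test/seek step described above for extraction can be sketched roughly as follows. This is only an illustration, not BRU's code: the function name `write_sparse` and the 4096-byte block size are assumptions made for the example, and whether a skipped region actually becomes an unallocated hole depends on the filesystem.

```python
import os

BLOCK_SIZE = 4096  # assumed block size; the post does not state one


def write_sparse(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    src and dst are binary file objects; dst must be a seekable regular
    file. Returns the number of blocks that were skipped with a seek.
    (Hypothetical helper, written only to illustrate the technique.)
    """
    zero_block = bytes(block_size)
    skipped = 0
    ended_in_hole = False
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zero_block[:len(block)]:
            # Block of null bytes: move the file offset forward, write nothing.
            dst.seek(len(block), os.SEEK_CUR)
            skipped += 1
            ended_in_hole = True
        else:
            dst.write(block)
            ended_in_hole = False
    if ended_in_hole:
        # A seek alone does not extend the file, so set the final length.
        dst.truncate(dst.tell())
    return skipped
```

Called as `write_sparse(open("big.in", "rb"), open("big.out", "wb"))` (both file names hypothetical), the output has the same contents and length as the input, and the skipped regions read back as null bytes.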