Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!sdd.hp.com!hplabs!hpl-opus!hpccc!hpcc01!hpcea!hpldsla!djw
From: djw@hpldsla.sid.hp.com (David Williams)
Newsgroups: comp.sys.hp
Subject: Re: find and cpio
Message-ID: <3140019@hpldsla.sid.hp.com>
Date: 13 Sep 90 17:15:48 GMT
References: <1150@prlhp1.prl.philips.co.uk>
Organization: HP Scientific Instruments Division - Palo Alto, CA
Lines: 56

I hope this makes sense. I've rewritten it a couple of times, but
there are just too many "words"....

> On HP-UX (6.21) we use their backup script which effectively does:
>
>     cd /
>     find . -hidden -print | cpio -ocxa | tcio ......
>
> Am I right in thinking that find works by looking in directories
> and -print's everything it finds, which of course means that
> several directory entries hard-linked to the same file will be
> picked up individually and passed to cpio for dumping ?

That's pretty much the way that it works. Note, though, that cpio
saves each file's inode number (and the device number of the file's
file system) and its link count in the archive. This leads to...

> When performing a restore using cpio I assume that each file read
> is allocated a fresh inode and therefore what may have been
> one file hard-linked several times will be restored as several
> individual files (taking up more disk-space than originally used).

Not quite. When doing the restore, cpio(1) tracks files in the
archive that have a link count greater than one. Cpio -i saves the
pathname of the first such file loaded, along with the inode/device
number recorded for it in the archive. If another file in the archive
has the same inode/device number, it is linked to this first file.
Simple, eh? So only the first 'file' in an archive allocates a new
inode on the target file system; all the others are linked to it - as
desired.

Note that back on the dump (cpio -o) side of things, each of a file's
N links is archived as a complete file.
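
A rough sketch of the restore-side bookkeeping described above, in
Python rather than anything resembling cpio's real source. The entry
tuples (path, dev, ino, nlink, data) and the restore() and demo()
names are made up for illustration; they only stand in for whatever
cpio actually reads from an archive header.

```python
import os
import tempfile


def restore(entries, dest):
    """Restore hypothetical archive entries into the directory dest.

    Each entry is (path, dev, ino, nlink, data): an assumed stand-in
    for a cpio header plus file contents, not the real archive format.
    """
    first_seen = {}  # archived (dev, ino) -> first pathname restored
    for path, dev, ino, nlink, data in entries:
        target = os.path.join(dest, path)
        key = (dev, ino)
        if nlink > 1 and key in first_seen:
            # Same archived inode/device as an earlier entry:
            # make a hard link instead of writing a second copy.
            os.link(first_seen[key], target)
            continue
        # First (or only) appearance: this one allocates a new inode.
        with open(target, "wb") as f:
            f.write(data)
        if nlink > 1:
            first_seen[key] = target
    return first_seen


def demo():
    # "a" and "b" were hard links on the source system; both entries
    # carry the full data, as on the cpio -o side. "c" is unrelated.
    archive = [
        ("a", 1, 100, 2, b"hello"),
        ("b", 1, 100, 2, b"hello"),
        ("c", 1, 101, 1, b"other"),
    ]
    dest = tempfile.mkdtemp()
    restore(archive, dest)
    return dest
```

Calling demo() and running os.stat() on the restored "a" and "b"
should show them sharing one inode, while "c" gets its own.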
Archiving every link as a complete file means tapes get filled up
more than needed, I guess, but it also means that on restore you
don't have to start with the first tape to get the "real" file. This
is a different strategy from the one used by some other tools, which
just store pointers for all links after the first - sometimes
resulting in a "go find tape number N if you want file 'blah'" type
of message.

Ftio(1) uses a similar strategy to cpio for storage of the links.
Tar(1) (and I think fbackup(1)) use the pointer strategy for every
link after the first.

Hope that helps,

David Williams
___________________________________________________________________
Hewlett-Packard Scientific Instruments Division (SID)  /\___________
1601 California Ave, Palo Alto, CA, USA.  /\______________/\________
phone: 415 857 6100. FAX: 415 852 8011  //\\____________|__________
HP-UX Mail: djw@hpldsla.hp.com  / \____/\____/\___________
HPdesk: (djw)hpldsla/HP1900/00  /\____________/ \__________