Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!unisoft!greywolf
From: greywolf@unisoft.UUCP (The Grey Wolf)
Newsgroups: comp.unix.internals
Subject: Re: Why is restore so slow?
Message-ID: <3357@unisoft.UUCP>
Date: 7 Feb 91 22:03:56 GMT
References: <19023@rpp386.cactus.org> <1022@eplunix.UUCP> <19028@rpp386.cactus.org> <15866.27b02da2@levels.sait.edu.au>
Reply-To: greywolf@unisoft.UUCP (The Grey Wolf)
Organization: Foo Bar and Grill
Lines: 67

In article <15866.27b02da2@levels.sait.edu.au> xtdn@levels.sait.edu.au writes:
>One such optimisation could be to write the raw disk to tape (actually you'd
>only dump those blocks that contain data that you want backed up, but the
>point is that you'd be reading from the raw disk). This would be quite fast
>because you wouldn't be opening each file (which takes time), or reading the
>file sequentially - see how much disk head movement you avoid? Now such a
>tape would consist of a number of chunks, each chunk detailing the file, the
>file offset, and the data to write at that offset. The restore process then
>becomes a matter of reading the next chunk, opening and seeking the file, and
>then writing the data. All that head movement, opening files, seeking to the
>right spot, and later, closing files, would certainly slow down the process.
>
>I already said that I don't know how dump/restore works, but I would almost
>be willing to bet that it's something like the scheme I just outlined. Maybe
>someone who does know could tell us what really happens?

You're not terribly far off, with the exception that UNIX doesn't keep a
timestamp for individual blocks -- only inodes hold the timestamp, and
there's no way to tell whether a particular block in the file has been
updated (this would be terribly inefficient anyway -- chances are that if
you've blown away a file, only having the changed blocks would be useless).
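
To illustrate the point about timestamps living in the inode, here is a small Python sketch. It is only an illustration built on the ordinary stat(2) interface, not anything dump itself does; the helper names `inode_times`, `changed_since` and `demo` are made up for this example. Rewriting one byte in the middle of a file moves the inode's mtime, and nothing in the result says which block was touched.

```python
import os
import tempfile


def inode_times(path):
    """Return the timestamps stat(2) reports for the inode behind `path`."""
    st = os.stat(path)
    return {"ino": st.st_ino, "atime": st.st_atime,
            "mtime": st.st_mtime, "ctime": st.st_ctime}


def changed_since(path, since):
    """True if the inode's data or metadata changed after `since` (epoch seconds).

    This is a whole-file answer: nothing here says which blocks changed.
    """
    st = os.stat(path)
    return st.st_mtime > since or st.st_ctime > since


def demo():
    """Rewrite one byte of a scratch file and return the times before and after."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 8192)
        path = f.name
    try:
        before = inode_times(path)
        with open(path, "r+b") as f:
            f.seek(4096)
            f.write(b"y")  # change a single byte in the second half
        after = inode_times(path)
        return before, after
    finally:
        os.unlink(path)
```

Calling `demo()` gives two dictionaries for the same inode number; the only evidence of the write is the file-level mtime/ctime.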

Dump works by reading the disk partition directly -- it performs all the
directory/file mapping on its own by reading the on-disk inode list for
that partition. It looks in /etc/dumpdates to determine how far back it
needs to look and, by looking at the inodes, makes an internal map of those
inodes which have been affected within the requested period of time. (With
a "level 0" dump, that means everything since the beginning of time:
4:00 pm, New Year's Eve, 1969 on the American West Coast ... (-:) It then
starts mapping the directories in, dumping the directory information out
and finally dumping the contents of the files. Wandering through the
filesystem by oneself and performing only the necessary operations is going
to be much faster than sitting and going through the kernel's filesystem
overhead.

[ Side note: I *hate* operators who cannot think to keep track of the
  inode number of the file that is being dumped when they do multiple
  tape dumps! Makes restores a *pain*. ]

Restore, on the other hand, is a dog. Why? It *has* to be. When files
are getting restored, one cannot simply re-write the raw disk; the
filesystem overhead cannot be avoided on anything less than a full restore.
Even for a full restore, one reason to avoid a raw data dump (via dd(1)
(yes, I know that's not what dd stands for)) is that a full backup and
restore reduces disk fragmentation by putting everything back more or less
contiguously. (We used to have to do this periodically back at the lab
because class users had a tendency to produce lots and lots of little
files. The /users file system would fragment ridiculously quickly over the
semester. I think fragmentation reached about 5% (which is very high).)

It's also kind of convenient that if a normal user wishes to effect a
partial restore, he/she generally can, without having to be placed into
a special group or be given super-user privileges.
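
The selection pass described above can be sketched roughly as follows. This is a simplified model in Python, not dump's source. The `Inode` tuple stands in for the on-disk inode, the `dumpdates` dictionary stands in for /etc/dumpdates, and all function names are invented for the example. Picking the most recent dump at a lower level as the cutoff is my assumption about how the "requested period of time" is chosen. The sketch also ignores the fact that directories leading to a changed file need to be mapped in as well.

```python
from collections import namedtuple

# Hypothetical stand-in for an on-disk inode (illustration only).
Inode = namedtuple("Inode", "ino is_dir mtime ctime")

# "The beginning of time": 00:00:00 UTC on 1 Jan 1970, which is
# 4:00 pm on 31 Dec 1969 on the American West Coast.
EPOCH = 0


def dump_since(level, dumpdates):
    """Pick the cutoff time for a dump at `level`.

    `dumpdates` maps level -> epoch seconds of the last dump at that level.
    A level 0 dump, or one with no lower-level record, goes back to the epoch.
    """
    lower = [t for lvl, t in dumpdates.items() if lvl < level]
    return max(lower) if lower else EPOCH


def select_inodes(inodes, since):
    """Build the map of inodes modified or changed at or after `since`."""
    return {i.ino for i in inodes if i.mtime >= since or i.ctime >= since}


def dump_order(inodes, since):
    """Return selected inode numbers: directories first, then file contents."""
    wanted = select_inodes(inodes, since)
    dirs = [i.ino for i in inodes if i.ino in wanted and i.is_dir]
    files = [i.ino for i in inodes if i.ino in wanted and not i.is_dir]
    return sorted(dirs) + sorted(files)
```

For example, with dumps recorded at level 0 (time 200) and level 1 (time 250), `dump_since(2, {0: 200, 1: 250})` picks 250 as the cutoff, while `dump_since(0, ...)` always falls back to the epoch and so selects every inode.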
>
>
>David Newall, who no longer works       Phone:  +61 8 344 2008
>for SA Institute of Technology          E-mail: xtdn@lux.sait.edu.au
>                "Life is uncertain:  Eat dessert first"

--
thought:  I ain't so damb dumn! | Your brand new kernel just dump core on you
war: Invalid argument           | And fsck can't find root inode 2
                                | Don't worry -- be happy...
...!{ucbvax,acad,uunet,amdahl,pyramid}!unisoft!greywolf