Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!sol.ctr.columbia.edu!cunixf.cc.columbia.edu!shenkin
From: shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin)
Newsgroups: comp.sys.sgi
Subject: Re: Questions bru-ing in my mind
Message-ID: <1990Dec28.170722.18489@cunixf.cc.columbia.edu>
Date: 28 Dec 90 17:07:22 GMT
References: <1990Dec21.174239.11753@odin.corp.sgi.com> <1990Dec24.155048.29640@cunixf.cc.columbia.edu> <1990Dec26.200730.7738@odin.corp.sgi.com>
Organization: Columbia University
Lines: 83

[Consisting of further comments on Dave's reply, plus a new thread at the end.]

In article <1990Dec26.200730.7738@odin.corp.sgi.com> olson@anchor.esd.sgi.com (Dave Olson) writes:
>In <1990Dec24.155048.29640@cunixf.cc.columbia.edu> shenkin@cunixf.cc.columbia.edu (Peter S. Shenkin) writes:
>| > = olson@anchor.esd.sgi.com (Dave Olson)
>
>| [[ bru -eZ ]]
>| But if I could do this, I could do something like
>|     bru -evZ / >& tmpfile
>| at the beginning of my backup day.  It would take a few hours to run,
>| but then I could feed tmpfile into an awk script or other simple program which
>| would divvy the files up into single-volume-sized groups, and then I could
>| bru the groups one by one.
>
>I've been convinced by you and other people's responses that -eZ
>should work.  I'll change that for the next major release, probably
>with a warning message at the start that it will take a long time.

Thanks, Dave.  Just a comment:  'bru -eZ' won't take any longer than 'bru -cZ',
just as 'bru -e' won't take any longer than 'bru -c', and warning messages get
to be very annoying.  I suggest you save the warning message for the manual --
and put it under Z, not e!  In fact, my 4d25 can't stream the tape drive using
'bru -cZ', even with virtually nothing else going on.
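For what it's worth, here is a rough sketch of the "divvy up" step mentioned in the quoted text above, written in Python rather than awk. It assumes the (path, size) pairs have already been extracted from the size-estimate listing; the format of that listing isn't shown in this thread, so no parsing is attempted. The function name, the capacity figure and the safety margin are all made up for illustration.

```python
def group_by_volume(files, capacity, margin=0.10):
    """Split (path, size) pairs into groups that each fit on one volume.

    files    -- iterable of (path, size_in_bytes); the sizes are assumed to
                come from an earlier estimate pass (hypothetical input)
    capacity -- nominal size of one volume, in bytes
    margin   -- fraction of the capacity held back for files that grow
                between the estimate and the real backup
    """
    usable = int(capacity * (1.0 - margin))
    groups, current, used = [], [], 0
    for path, size in files:
        if size > usable:
            raise ValueError(f"{path} ({size} bytes) cannot fit on one volume")
        # Start a new group as soon as the next file would overflow this one.
        if used + size > usable and current:
            groups.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Illustrative numbers only: a 150 MB volume with 10% held in reserve.
    estimate = [("/usr/people/a", 60_000_000), ("/usr/people/b", 90_000_000)]
    for number, group in enumerate(group_by_volume(estimate, 150_000_000), 1):
        print(number, group)
```

Files are kept in their original order, so each group stays a contiguous run of the listing. That wastes some tape compared with a proper bin-packing pass, but it is simple and it keeps neighbouring files on the same volume.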
>Note that making such a partitioning at any time prior to the backup
>always introduces the 'risk' that the size of that directory tree may
>increase due to new files, growing files, core files, etc., so it may
>not fit no matter what you do...

This is possible, but hardly likely if you are either (1) leaving a margin
of error appropriate to your system, or (2) backing up a single-user
workstation by popping a tape in as you leave for the day.  Yes, of course you
might have jobs running in background that could be growing files, but I still
think we deserve to be able to make the best guess we can, and of course bear
the consequences when we guess wrong.  After all, if you couldn't shoot
yourself in the foot with it, it wouldn't be UNIX.

NEW THREAD:

I've been thinking about something that I first noticed several years ago
when restoring a multi-user VAX from a 0-level (i.e., full) dump tape plus
several incrementals, following a disk crash.  When you do such a restore, you
get all the files that were there as of the time of the last incremental, but
you also get files -- a whole lot of files, in my experience -- that users had
deleted since the 0-level dump was made.  That is, you don't really restore
the file system; you get a lot of chaff in there along with all the wheat.
I personally found that weeding the extraneous stuff out was a real chore.
And where disk space is tight, this process could actually overflow available
storage.

Is this enough of a problem for people to consider a backup strategy that
eliminates it?  One way to do this would be:  each time you do an incremental
backup, make a complete list of all files present, as well as copies of those
files that have changed "recently".  When restoring from an incremental tape,
have an option that deletes from disk any file that is not on the list -- or,
alternatively, instead of deleting it, puts it in a special place, such as a
duplicate file-tree built under, say, '/delete'.
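The restore-time option described above might look roughly like this in Python. This is only a sketch under assumptions: the list of files is taken to be a plain collection of path names, only ordinary files (not directories) are considered, and the function names and the 'skip' parameter are invented here for illustration. No existing backup program is known to me to work this way.

```python
import os
import shutil


def find_extraneous(root, manifest, skip=None):
    """Return, sorted, the files under root that are not named in manifest.

    manifest -- paths recorded at the time of the last incremental
    skip     -- optional directory to leave out of the walk, e.g. the
                holding area itself if it lives under root
    """
    listed = set(manifest)
    extras = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the holding area so it is never reported as extraneous.
        dirnames[:] = [d for d in dirnames if os.path.join(dirpath, d) != skip]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if path not in listed:
                extras.append(path)
    return sorted(extras)


def quarantine(root, extras, holding_dir):
    """Move each extraneous file into a mirror of its place under holding_dir."""
    moved = []
    for path in extras:
        dest = os.path.join(holding_dir, os.path.relpath(path, root))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(path, dest)
        moved.append(dest)
    return moved
```

Moving rather than deleting is the cautious choice: if the list turns out to be stale or incomplete, nothing is lost, and the holding area can be cleared by hand once the restored system has been checked.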
It would be possible for users to emulate this functionality as follows.
Each time a backup is done, first write a "table of contents" of the file
system to some place on the disk, and make sure this table is included in the
backup.  If it becomes necessary to do a complete restore, a new table of
contents could be made following the final incremental restore.  A program
could read this new table, check for the presence of each entry in the old
table, and make a list of the entries in 'new' that are not in 'old'.  In
fact, if the same program makes both tables, then 'diff' will suffice for
this.  Then another program could go through the 'diff' list and delete or
move the files in it.  'find / -print' could be used to make the tables.

Well, now that I've said that, it seems so straightforward that I have no
suggestions to Dave.  But it seems such a departure from the backup strategies
I am aware of that I'd like to get people's opinions of it.  I think this
would also lengthen the practical interval between full backups.  It may be
that some people do this already, but if so I'm unaware of it.

	-P.
************************f*u*cn*rd*ths*u*cn*gt*a*gd*jb**************************
Peter S. Shenkin, Department of Chemistry, Barnard College, New York, NY 10027
(212)854-1418  shenkin@cunixf.cc.columbia.edu(Internet)  shenkin@cunixf(Bitnet)
***"In scenic New York... where the third world is only a subway ride away."***