Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!spool.mu.edu!agate!csam.lbl.gov!csam!clarsen
From: clarsen@lbl.gov (Case Larsen)
Newsgroups: alt.hackers
Subject: Re: daft way to undo rm
Message-ID: <CLARSEN.91Jun13152022@intruder.lbl.gov>
Date: 13 Jun 91 20:20:22 GMT
References: <1991Jun12.231044.14542@spider.co.uk>
	<1991Jun13.023922.26166@ibmpcug.co.uk>
Sender: usenet@csam.lbl.gov
Distribution: alt
Organization: Coalition for Properly Tail Recursive Languages
Lines: 50
Approved: joe-code@csam.lbl.gov
In-Reply-To: dylan@ibmpcug.co.uk's message of 13 Jun 91 02:36:29 GMT
Nntp-Posting-Host: intruder.lbl.gov


Late last year, someone "accidentally" typed 'rm -rf /usr/users' on
one of my friend's NeXT machines (Mach/Unix with the BSD 4.3
filesystem) as root.  Yes, I think they are still working there.  Of
course, a lot of people were angry.  "No big problem", you say, "just
restore from backup tape."  But my friend wasn't keeping backups
because of some silly reason--media cost, most likely.

At this point, we used 'dd' to copy all 330 megabytes of /dev/rsd0a,
compressed, to a file for later perusal.  (Eventually, we had to
spread the 330 megs across multiple disks because there just wasn't
that much free space on any one disk.)  Another friend and I then made
a feeble attempt to recover inode information from the disk.  As I
remember, the directory inodes of /usr/users were still intact, but
the inode pointers to the subdirectories had been zeroed out.
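
For anyone wanting to repeat the dump-and-spread step without 'dd',
here is a minimal sketch in Python.  It is not what we ran; the
function name, the piece size and the output naming scheme are all
made up for illustration.  It reads a raw device or image file and
writes it out as separately gzip-compressed pieces that can live on
different disks.

```python
import gzip

def dump_in_pieces(src_path, dest_prefix,
                   piece_bytes=64 * 1024 * 1024, chunk=1024 * 1024):
    """Copy src_path into gzip pieces holding at most piece_bytes of raw data each.

    Hypothetical helper: names and sizes are illustrative assumptions.
    Returns the list of piece file names, in order.
    """
    pieces = []
    with open(src_path, "rb") as src:
        index = 0
        while True:
            written = 0
            out = None
            while written < piece_bytes:
                data = src.read(min(chunk, piece_bytes - written))
                if not data:
                    break  # end of the source
                if out is None:
                    # Open the piece lazily so no empty piece is created.
                    name = f"{dest_prefix}.{index:03d}.gz"
                    out = gzip.open(name, "wb")
                    pieces.append(name)
                out.write(data)
                written += len(data)
            if out is not None:
                out.close()
            if written < piece_bytes:
                break  # the source ran out inside (or before) this piece
            index += 1
    return pieces
```

Decompressing the pieces in order and concatenating the results gives
back the original image.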

For attempt #2, we got a version of 'fsdb', pulled out the inode
manipulation code and wrote a pattern scanner for the disk image.  The
first thing we did was map out all the blocks of the disk that
contained something "textual" and "important".  "Textual" and
"important" ruled out blocks containing what looked like system logs,
mail messages, binary data, etc.
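
A rough idea of the "textual" half of that map, sketched in Python
rather than taken from our scanner.  The 8K block size, the 95%
printable threshold and the function names are assumptions for
illustration, and nothing here tries to judge "important" (weeding
out logs and mail would need extra pattern checks on top).

```python
BLOCK_SIZE = 8192  # assumption: 8K filesystem blocks

# Printable ASCII plus tab, newline and carriage return.
PRINTABLE = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}

def looks_textual(block, threshold=0.95):
    """Guess whether a block holds text: mostly printable bytes.

    Trailing NULs are ignored, on the assumption that the unused tail
    of a file's last block is zero-filled.
    """
    block = block.rstrip(b"\0")
    if not block:
        return False
    printable = sum(1 for b in block if b in PRINTABLE)
    return printable / len(block) >= threshold

def map_text_blocks(image_path, block_size=BLOCK_SIZE):
    """Return the block numbers in the image that look like text."""
    text_blocks = []
    with open(image_path, "rb") as img:
        number = 0
        while True:
            block = img.read(block_size)
            if not block:
                break
            if looks_textual(block):
                text_blocks.append(number)
            number += 1
    return text_blocks
```

The returned list plays the role of the map file: later passes only
need to look at those block numbers.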

At this point, I think there were about 50 megabytes left.  We then
took the map files for the text blocks and had the scanner search
through them for certain files that the professors thought they had
lost.  All that was needed was a string of text, such as the title of
the paper, or the professor's name.

As soon as a block containing the string was found, the consecutive
blocks before and after the block could be dumped to a file.  In
almost all the cases, the blocks of the original file were both
consecutive and in order.  The time from being given the search text
to dumping a copy of the text file was about 30 minutes.  Most of
this was spent searching through the 50 megs of text blocks.
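
The search-and-dump step could look something like the following
Python sketch.  Again this is a reconstruction of the idea, not our
code: the function names, the 8K default block size and the number of
neighbouring blocks to dump are assumptions.

```python
def find_blocks(image_path, text_blocks, needle, block_size=8192):
    """Return the numbers of the mapped text blocks that contain needle (bytes)."""
    hits = []
    with open(image_path, "rb") as img:
        for number in text_blocks:
            img.seek(number * block_size)
            if needle in img.read(block_size):
                hits.append(number)
    return hits

def dump_neighbourhood(image_path, hit, out_path,
                       before=2, after=2, block_size=8192):
    """Write block `hit` plus its consecutive neighbours to out_path.

    Returns the number of the first block written.  Relies on the
    file's blocks being consecutive and in order on disk.
    """
    first = max(0, hit - before)
    count = (hit - first) + 1 + after
    with open(image_path, "rb") as img, open(out_path, "wb") as out:
        img.seek(first * block_size)
        out.write(img.read(count * block_size))
    return first
```

A search string that happens to straddle a block boundary will be
missed by this version, so a second, different string from the same
file is worth trying when the first one finds nothing.  The dumped
file still has to be trimmed by hand at both ends.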

The saving grace of the BSD filesystem was that for small
files (less than 8K), an entire 8K block was allocated instead of a
scattered bunch of smaller 1K fragments.  I'm not sure what happens
with files larger than 8K, but at some point, they start to get
staggered to improve transfer time and striped across the cylinder
groups. We never tried to recover files that large, so it wasn't an
issue.

--
Case Larsen
clarsen@lbl.gov           uunet!ucbvax!lbl-csam!clarsen
Voice: (415) 486-5601     Fax: 486-5548
Lawrence Berkeley Laboratory,
One Cyclotron Road -- MS 50F, Berkeley CA 94720