Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!dali.cs.montana.edu!caen!sdd.hp.com!think.com!snorkelwacker.mit.edu!bloom-picayune.mit.edu!athena.mit.edu!jik
From: jik@athena.mit.edu (Jonathan I. Kamens)
Newsgroups: comp.unix.admin
Subject: Re: Non Destructive Version of rm
Message-ID: <1991May7.095912.17509@athena.mit.edu>
Date: 7 May 91 09:59:12 GMT
References: <11941@mentor.cc.purdue.edu>
Sender: news@athena.mit.edu (News system)
Distribution: na
Organization: Massachusetts Institute of Technology
Lines: 156


  (I have addressed some of Bruce's points in my last posting, so I will not
repeat here any point I have made there.)

In article <11941@mentor.cc.purdue.edu>, asg@sage.cc.purdue.edu (The Grand Master) writes:
|> Explain something to me Jon - first you say that /var/preserve will not
|> exist on all workstations, then you say you want a non-differing 
|> environment on all workstations. If so, /var/preserve SHOULD 
|> exist on all workstations if it exists on any. Maybe you should make
|> sure it does.

  The idea of mounting one filesystem from one fileserver (which is what
/var/preserve would have to be, if it were to look the same from any
workstation so that any file could be recovered from any workstation) on all
workstations in a distributed environment does not scale well to even 100
workstations, let alone the over 1000 workstations that we have, and our
environment was designed to scale well to as many as 10000 workstations or
more.

  If it doesn't scale, then it doesn't work in our environment.  So we can't
"make sure" that /var/preserve appears on all workstations.

|> 	However, what Jon fails to point out is that one must remember
|> where they deleted a file from with his method too. Say for example I do
|> the following.
|> $ cd $HOME/src/zsh2.00/man
|> $ delete zsh.1
|>  Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
|> to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I 
|> DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in 
|> the directory it was deleted from. Or does your undelete program also
|> search the entire damn directory structure of the system?

  Um, the whole idea of Unix is that the user knows what's in the file
hierarchy.  *All* Unix file utilities expect the user to remember where files
are.  This is not something new, nor (in my opinion) is it bad.  I will not
debate that issue here; if you wish to discuss it, start another thread.  I
will only say that our "delete" was designed in conformance with the Unix
paradigm, so if you wish to criticize this particular design decision, you
must be prepared to criticize and defend your criticism of every other Unix
utility which accepts the same design criterion.

|> This is much better than letting EVERY DAMN DIRECTORY ON THE SYSTEM
|> GET LARGER THAN IT NEEDS TO BE!!

  How many deleted files do you normally have in a directory in any three-day
period, or seven-day period, or whatever?

|> Say I do this
|> $ ls -las
|> 14055 -rw-------   1 wines    14334432 May  6 11:31 file12.dat
|> 21433 -rw-------   1 wines    21860172 May  6 09:09 file14.dat
|> $ rm file*.dat
|> $ cp ~/new_data/file*.dat .
|> [ note at this point, my directory will probably grow to a bigger
|> size since there is now a full 70 Meg in one directory as opposed
|> to the 35 meg that should be there using John Navarra's method]

  First of all, the size of a directory has nothing to do with the size of the
files in it, only with the number of entries in it.  Two extra file entries in
a directory increase its size negligibly, if at all (since directories are
sized in block increments).

  Second, using John Navarra's method, assuming a separate partition for
deleted files, I could do this:

1. Copy 300meg of GIF files into /tmp.

2. "rm" them all.

3. Every day or so, "undelete" them into /tmp, touch them to update the
   modification time, and then delete them.

Now I'm getting away with using the preservation area as my own personal file
space, quite possibly preventing other people from deleting files.

  Using $HOME/tmp avoids this problem, but (as I pointed out in my first
message in this thread), you can't always use $HOME/tmp, so there is probably
going to be a way for a user to spoof the program into putting the files
somewhere nifty.

  You could put quotas on the preserve directory.  But the user's home
directory already has a quota on it (if you're using quotas), so why not just
leave the file in whatever filesystem it was in originally?  Better yet, in
the degenerate case, just leave it in the same directory it was in
originally, with the same owner, thus guaranteeing it will be counted under
the correct quota until it is permanently removed!  That's a design
consideration I neglected to mention in my previous messages....
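
  To make the quota point concrete, here is a rough sketch of deleting in
place by renaming the file within its own directory.  This is an illustration
only, not the code of my package.  The ".#" marker and the function names are
assumptions chosen for the example.

```python
import os
import tempfile

PREFIX = ".#"  # hypothetical "deleted" marker; an assumption for this sketch


def delete(path):
    """Mark a file deleted by renaming it inside its own directory."""
    directory, name = os.path.split(path)
    marked = os.path.join(directory, PREFIX + name)
    # No version control: on POSIX systems an earlier deleted file with
    # the same name is silently replaced.
    os.rename(path, marked)
    return marked


def undelete(path):
    """Reverse delete(); path is the file's original name."""
    directory, name = os.path.split(path)
    os.rename(os.path.join(directory, PREFIX + name), path)


def demo():
    """Check that filesystem and owner survive a delete/undelete cycle."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "zsh.1")
        with open(path, "w") as f:
            f.write("man page\n")
        st0 = os.stat(path)
        st1 = os.stat(delete(path))
        unchanged = (st0.st_dev, st0.st_uid) == (st1.st_dev, st1.st_uid)
        undelete(path)
        return unchanged, os.path.exists(path)


if __name__ == "__main__":
    print(demo())
```

  Because the rename never leaves the directory, the file stays on the same
filesystem with the same owner, so whatever quota applied to it before
deletion still applies afterwards.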

|> [work deleted]
|> $ rm file*.dat
|> (hmm, I want that older file12 back - BUT I CANNOT GET IT!)

  You can't get it back in the other system suggested either.

  I have been considering adding "version control" to my package for a while
now.  I haven't gotten around to it.  It would not be difficult.  But the
issue of version control is equivalent in both suggested solutions, and is
therefore not an issue.

|> Well most of us try not to go mounting filesystems all over the place.
|> Who would be mounting your home dir on /mnt?? AND WHY???

  In a distributed environment of over 1000 workstations, where the vast
majority of file space is on remote filesystems, virtually all file access
happens on mounted filesystems.  A generalized solution to this problem must
therefore be able to cope with filesystems mounted in arbitrary locations.

  For example, let's say I have an NFS home directory that usually mounts on
/mit/jik.  But then I log into one of my development machines in which I have
a local directory in /mit/jik, with my NFS home directory mounted on
/mit/jik/nfs.  This *happens* in our environment.  A solution that does not
deal with this situation is not acceptable in our environment (and will
probably run into problems in other environments as well).
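
  One way to see why mount points matter: rename() cannot move a file from
one filesystem to another, so a scheme that moves deleted files into a central
directory has to find out first whether the file and that directory share a
filesystem.  The sketch below is illustrative only.  The function name is
hypothetical, and comparing st_dev is the usual test but not the only one.

```python
import os
import tempfile


def on_same_filesystem(a, b):
    """True if a and b are on the same mounted filesystem (same st_dev)."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo.c")
        with open(path, "w") as f:
            f.write("/* source */\n")
        # A regular file and the directory that holds it normally share a
        # filesystem, so a rename within that directory cannot fail for
        # crossing a mount point.
        return on_same_filesystem(path, os.path.dirname(path))


if __name__ == "__main__":
    print(demo())
```

  A central preserve directory gives no such guarantee.  Whether the check
succeeds depends on where each filesystem happens to be mounted on that
particular workstation.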

|> Is this system source code? If so, I really don't think you should be 
|> deleting it with your own account.

  First of all, it is not your prerogative to question the source-code access
policies at this site.  For your information, however, everyone who has write
access to the "system source code" must authenticate that access using a
separate Kerberos principal with a separate password.  I hope that meets with
your approval.

  Second, this is irrelevant.

|> But if that is what you wish, how about
|> a test for if you are in your own directory. If yes, it moves the
|> deleted file to $HOME/tmp, if not, it moves it to ./tmp (or ./delete, or
|> ./wastebasket or whatever)

  How do you propose a 100% foolproof test of this sort?  What if I have a
source filesystem mounted under my home directory?  For all intents and
purposes, it will appear to be in my home directory.  What if I have a source
tree in my home directory, and I delete a file in it, then tar up the source
directory and move it into another project directory, and then realize a
couple of days later that I need to undelete the file, but it's not there
anymore because it was deleted in my home directory and not in the project
directory?

  How do you propose to move state about deleted files when hierarchies are
moved in that manner?

  Your suggested alternate solutions to this problem, which I have omitted,
all save state in a way that degenerates into saving the state in each
directory by leaving the files there.  Furthermore (something that has not yet
been mentioned), a set of utilities which leaves the files in place is far
less complex to implement than any alternative.  And the less complex an
implementation is, the easier it is to get it right (and optimize it, and fix
any bugs that do pop up, etc.).
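
  As a small illustration of state travelling with the hierarchy, the sketch
below deletes a file in place, moves the whole tree somewhere else, and then
looks for deleted files in the new location.  It is a sketch only.  The ".#"
marker and the helper names are assumptions, and shutil.move stands in for
the tar-and-move I described above.

```python
import os
import shutil
import tempfile

PREFIX = ".#"  # hypothetical "deleted" marker; an assumption for this sketch


def deleted_files(root):
    """List marked files under root, as paths relative to root."""
    found = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            if name.startswith(PREFIX):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)


def demo():
    with tempfile.TemporaryDirectory() as top:
        src = os.path.join(top, "home", "src")
        man = os.path.join(src, "man")
        os.makedirs(man)
        with open(os.path.join(man, "zsh.1"), "w") as f:
            f.write("man page\n")
        # "Delete" the file by renaming it within its own directory.
        os.rename(os.path.join(man, "zsh.1"),
                  os.path.join(man, PREFIX + "zsh.1"))
        before = deleted_files(src)
        # Move the whole source tree into a project directory.
        dst = os.path.join(top, "project", "src")
        os.makedirs(os.path.dirname(dst))
        shutil.move(src, dst)
        return before, deleted_files(dst)


if __name__ == "__main__":
    print(demo())
```

  The deleted file turns up under the new location at the same relative
path, without any separate bookkeeping having to be moved along with the
tree.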

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710