Newsgroups: comp.unix.admin
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!think.com!snorkelwacker.mit.edu!bloom-picayune.mit.edu!athena.mit.edu!jik
From: jik@athena.mit.edu (Jonathan I. Kamens)
Subject: Re: Non Destructive Version of rm
In-Reply-To: navarra@casbah.acns.nwu.edu's message of 3 May 91 21:26:19 GMT
Message-ID:
Sender: news@athena.mit.edu (News system)
Organization: Massachusetts Institute of Technology
References: <144@larry.UUCP> <11283@statware.UUCP> <1991May3.212619.21119@casbah.acns.nwu.edu>
Distribution: na
Date: Mon, 6 May 91 04:15:12 GMT
Lines: 98

  John Navarra suggests a non-destructive version of 'rm' that either
moves the deleted file into a directory such as
/var/preserve/username, which is periodically reaped by the system and
from which the user can retrieve accidentally deleted files, or uses a
directory $HOME/tmp and does a similar thing.

  He points out two drawbacks of leaving the deleted file in the
directory it was in before it was deleted.  First, the entire
directory tree must be searched in order to reap deleted files, which
is slower than searching a single directory.  Second, the files show
up when the "-a" or "-A" flag to ls is used to list the files in a
directory.

  A design similar to his was considered when we set about designing
the non-destructive rm currently in use (as "delete") at Project
Athena and available in the comp.sources.misc archives.  There were
several reasons why we chose the approach of leaving files in the same
directory, rather than Navarra's approach.  They include:

1. In a distributed computing environment, it is not practical to
   assume that a world-writeable directory such as /var/preserve will
   exist on all workstations, and be accessible identically from all
   workstations (i.e. if I delete a file on one workstation, I must be
   able to undelete it on any other workstation; one of the tenets of
   Project Athena's services is that, as much as possible, they must
   not differ when a user moves from one workstation to another).
   Furthermore, the "delete" program cannot run setuid in order to
   have access to the directory, both because setuid programs are a
   bad idea in general, and because setuid has problems in remote
   filesystem environments (such as Athena's).  Using $HOME/tmp
   alleviates this problem, but there are others....

2. (This is a big one.)  We wanted to ensure that the interface for
   delete would be as close as possible to that of rm, including
   recursive deletion and other stuff like that.  Furthermore, we
   wanted to ensure that undelete's interface would be close to
   delete's and as functional.  If I do "delete -r" on a directory
   tree, then "undelete -r" on that same filename should restore it,
   as it was, in its original location.

   Navarra's scheme cannot do that -- his script stores no information
   about where files lived originally, so users must undelete files by
   hand.  If he were to attempt to modify it to store such
   information, he would have to either (a) copy entire directory
   trees to other locations in order to store their directory tree
   state, or (b) munge the filenames in the deleted file directory in
   order to indicate their original locations, and search for
   appropriate patterns in filenames when undeleting, or (c) keep a
   record file in the deleted file directory of where all the files
   came from.

   Each of these approaches has problems.  (a) is slow, and can be
   unreliable.  (b) might break in the case of funny filenames that
   confuse the parser in undelete, and undelete is slow because it has
   to do pattern matching on every filename when doing recursive
   undeletes, rather than just opening and reading directories.  (c)
   introduces all kinds of locking problems -- what if two processes
   try to delete files at the same time?

3. If all of the deleted files are kept in one directory, the
   directory gets very large.  This makes searching it slower, and
   wastes space (since the directory will not shrink when the files
   are reaped from it or undeleted).

4. My home directory is mounted automatically under /mit/jik, but
   someone else may choose to mount it on /mnt, or I may choose to do
   so.  The undeletion process must be independent of mount point, and
   therefore storing original paths of filenames when deleting them
   will fail if a different mount point is later used.  Using the
   filesystem hierarchy itself is the only way to ensure mount-point
   independent operation of the system.

5. It is not expensive to scan the entire tree for deleted files to
   reap, since most systems already run such scans every night,
   looking for core files, *~ files, etc.  In fact, many Unix systems
   come bundled with a crontab that searches for # and .# files every
   night by default.

6. If I delete a file in our source tree, why should the deleted
   version take up space in my home directory, rather than in the
   source tree?  Furthermore, if the source tree is on a different
   filesystem, the file can't simply be rename()d to put it into my
   deleted file directory; it has to be copied.  That's slow.  Again,
   using the filesystem hierarchy avoids these problems, since
   rename() within a directory always works (although I believe
   renaming a non-empty directory might fail on some systems; they
   deserve to have their vendors shot :-).

7. Similarly, if I delete a file in a project source tree that many
   people work on, then other people should be able to undelete the
   file if necessary.  If it's been put into my home directory, in a
   temporary location which presumably is not world-readable, they
   can't.  They probably don't even know who deleted it.

Jonathan Kamens                         USnail:
MIT Project Athena                      11 Ashford Terrace
jik@Athena.MIT.EDU                      Allston, MA  02134
Office: 617-253-8085                    Home: 617-782-0710
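
P.S.  To make points 2, 4 and 6 above concrete, here is a rough sketch
of the rename-in-place idea.  It is not the Athena "delete" source.
The ".#" prefix and the function names (deleted_name, delete,
undelete, find_deleted) are assumptions made up for illustration, and
the sketch ignores name collisions and permissions entirely.

```python
import os

PREFIX = ".#"  # assumed marker for deleted entries; hypothetical choice


def deleted_name(path):
    """Return the in-place 'deleted' name: same directory, prefixed basename."""
    head, tail = os.path.split(path.rstrip(os.sep))
    return os.path.join(head, PREFIX + tail)


def delete(path):
    """Mark a file or directory tree as deleted by renaming it where it sits."""
    target = deleted_name(path)
    # Source and target share a parent directory, so the rename never
    # crosses a filesystem boundary and nothing has to be copied.
    os.rename(path, target)
    return target


def undelete(path):
    """Restore an entry, named by its original path, from its deleted name."""
    os.rename(deleted_name(path), path)
    return path


def find_deleted(root):
    """Yield deleted entries under root, as a nightly reaper scan might."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name.startswith(PREFIX):
                yield os.path.join(dirpath, name)
        # Do not descend into directories that are themselves deleted.
        dirnames[:] = [d for d in dirnames if not d.startswith(PREFIX)]
```

Since no original path is recorded anywhere, undelete works the same
whichever mount point the tree is reached through, and a deleted
directory tree comes back whole from a single rename of its top-level
entry.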