Path: utzoo!attcan!uunet!zephyr.ens.tek.com!uw-beaver!mit-eddie!bloom-beacon!athena.mit.edu!jik
From: jik@athena.mit.edu (Jonathan I. Kamens)
Newsgroups: comp.unix.internals
Subject: Re: Trojan Horses
Message-ID: <1990Oct26.005843.12463@athena.mit.edu>
Date: 26 Oct 90 00:58:43 GMT
References: <1885@necisa.ho.necisa.oz> <5238:Oct2322:14:3690@kramden.acf.nyu.edu> <1893@necisa.ho.necisa.oz> <8645:Oct2521:49:5790@kramden.acf.nyu.edu>
Sender: daemon@athena.mit.edu (Mr Background)
Reply-To: jik@athena.mit.edu (Jonathan I. Kamens)
Organization: Massachusetts Institute of Technology
Lines: 188

In article <8645:Oct2521:49:5790@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
|> Now let's take an error matching my quoted description. close() is not
|> documented as returning EDQUOT.

"The close(2) man page (at least on my system) has not been updated to reflect the fact that close() can return EDQUOT. Just like it doesn't say (at least on my system) that close can return EINTR. I have to check for EINTR, even though it isn't documented, but I don't have to check for EDQUOT."

Very logical.

|> There's no logical reason to expect it
|> to return EDQUOT

"In my opinion, there is no logical reason to expect it to return EDQUOT."

That's wonderful, but several filesystem implementors and kernel programmers have disagreed with you. Some of the code written by those implementors and programmers is in wide use all over the Unix world. So, your program can either accept the decision they've made, and be more robust, or ignore it, and work less robustly and less reliably.

|> ---even if write()'s buffering weren't hidden below its
|> interface, it could easily return the error accurately.

"In my opinion, it should be possible to implement any filesystem, local or remote, so that the kernel can always return quota errors to a user process immediately upon write(), and not delay them until close()."

You suggested one possible way to do this in another posting, Dan, and I asked you to expand on it, since I didn't understand what you were saying. I have seen no response to my request. Why not?

In any case, I will repeat what I've said already -- Dan, in *my* opinion, it is *not* necessarily possible to implement any filesystem reasonably and guarantee immediate quota reports. As far as I can tell, there are several possible ways to do this:

1. Each time a remote filesystem/device is accessed for the first time, the kernel has to get and store the user's quota on that device. Each time *any* filesystem operation is performed on that device from that point forward, the kernel has to adjust the quota value appropriately. This means write()s, seek()s, unlink()s, mkdir()s, link()s, symlink()s, mknod(), and anything else I forgot.

2. All filesystem calls must be completely synchronous, and must be completed by the kernel before control returns to the user process. This way, the kernel can get quota errors from the fileserver immediately.

3. The kernel has to "reserve" space on the remote filesystem before performing any file operations, and "ask" for more space whenever it uses up the space it's asked for.

Now, let's analyze why I think that none of these approaches is reasonable:

1. First of all, there are many calls that affect the filesystem for which the kernel *does not need to know* the effect on the filesystem in terms of space. By requiring the kernel to keep track of quotas, you are requiring that it understand the lowest level details of the remote filesystem, so that (for example) it knows that adding a file to a directory will take away xx bytes from the user's quota. All levels of abstraction and protocol layers vanish when the local kernel has to be all-knowing about the remote filesystem.

Of course, you could require that the remote filesystem *tell* the client kernel how much space was allocated/released by each filesystem operation.
But that's just a degeneration into case (2), which I will deal with below.

The other problem is that remote filesystems can have more than one kernel operating on them at a time. Kernel A can't know if Kernel B is also operating on the filesystem, with the net result being complete quota chaos if, for example, I'm compiling a program for two different platforms on two different machines in the same remote filesystem.

You could solve this problem by requiring the remote filesystem to notify all current clients of the space allocation/freeing whenever any client performs any filesystem operation. Gee, now there's a model of efficiency for you. Now the fileserver has to keep a current list of all clients who are using each filesystem on it (if that's even possible), and has to communicate with them whenever anybody does a filesystem operation.

2. One of the greatest features of AFS, for example, is that files don't actually go out over the network until you're done modifying them (unless you do a seek(), I believe). Requiring completely synchronous operation would eliminate much of the speed advantage of using AFS.

I like AFS. In my opinion, it's a great filesystem. I'm willing to pay the cost of AFS's great performance by checking for EDQUOT after close() in my programs.

3. Have you ever rented a car? You know how they put a really big almost-charge (I forget what the word they use for it is) on the credit card in case you do something nasty? Did you know that if that puts you over your limit, you won't be able to use the card for anything else until you return the car, and possibly not for several days after that, since the car rental company may not bother to clear the charge until it expires automatically?

Would you like the same thing to be true with filesystems? I think I may need a meg in the filesystem, so I ask for it. But in reality, the user process for which I asked for that meg is only going to write a 1 meg file. That 1 meg puts me close to my quota.
Now, no other kernel anywhere can write to my filesystem until the user process finishes. If I crash, how does the remote fileserver find out that the pre-allocation is no longer needed?

|> There's no good
|> way to handle EDQUOT if it comes up

"In my opinion, there's no good way to handle EDQUOT if it comes up."

As I have already pointed out, Dan, almost all of our filesystem access at Athena is on filesystems that have the "problem" we are discussing. And yet we have very little, if any, problem with programs that can't "handle EDQUOT" in a "good way". You have asserted, as fact, that there is no good way, and yet I can give you any number of examples of programs that have found a good way. Therefore, your "fact" must not be a "fact".

Now, it is true that *for some programs*, there will not be any good way to deal with EDQUOT from close. Just as, for example, /bin/login can't deal with setuid() failing, so it gives up. We have that problem because Ultrix doesn't deal with user IDs as high as BSD 4.3 does, so some of our users can't log in on our Ultrix workstations. The worst thing a program can do is say, "Whoops, I just noticed an error," and abort.

|> ---even if I had some reason to
|> suspect that close() could return the error,

I guess that the fact that two of the most popular remote filesystems in use today allow close() to return the error is not "some reason to suspect" that it will happen?

You're living in a fantasy world, Dan. You don't want close() to return EDQUOT. Well, it does. Either your programs deal with that fact, or they fail inexplicably.

|> I simply *cannot* replay
|> data that might have been buffered in the same file by a previous
|> process.

This is a straw man, Dan. No one is claiming that every program can recover from a failing close(). It's a fact of life that programs sometimes encounter error conditions from which they cannot recover. If you cannot recover, you give up, after telling the user that you cannot recover.
If you *can* recover, on the other hand, you do. It's that simple.

|> Tell me: Why should I handle EDQUOT? Why should I interpret it
|> as some sort of error? Who benefits if I thrash about upon this error?

You should handle it because your program can be more robust if you handle it. You should handle it because if you don't, your programs will fail and the user will never find out about it. You should handle it because it's there, and because until it's *not* there, your customers are going to complain about your failure to handle it.

If you want to go on a crusade against close() failing with EDQUOT, feel free to do so. But I submit that until your crusade is successful, you are nothing but stupid to say, "Since it shouldn't fail with EDQUOT, I'm not going to check to see if it fails with EDQUOT."

And I'd say that emacs users benefit incredibly from the fact that emacs "thrash[es] about upon this error," because they don't lose files that they would otherwise lose without ever being told that they were lost.

|> Not at all. This might be true for calls that give me information, but
|> close() is not such a call.

As I have already illustrated at length in other messages, this assertion is no longer true. In modern, 1990 Unix, close() does give the programmer information. As I've already said, if you choose to ignore that information because you believe religiously that it shouldn't exist, then your code is less robust than it can be.

|> Do you check the return value of assert()?

This is just stupid, and it's just like your argument about setuid() returning ENOENT. Are you claiming that there is just ONE version of Unix, ANYWHERE, where it is even remotely possible that assert() will return a useful value? I don't know of any. I doubt there are any. assert() is not documented as *ever* returning an error indication.
On the other hand, close() *is* documented as returning an error indication, and furthermore, there are already many versions of Unix for which close() can fail with EDQUOT. They exist. They are being used. And you have to cope with them.

-- 
Jonathan Kamens                         USnail:
MIT Project Athena                      11 Ashford Terrace
jik@Athena.MIT.EDU                      Allston, MA  02134
Office: 617-253-8085                    Home: 617-782-0710
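
A minimal sketch, in C, of the check argued for in this message: treat a failed close() exactly like a failed write(), and tell the user rather than losing the file silently. The function names save_checked() and save_or_report() are hypothetical, invented for illustration; nothing here is taken from emacs or any other program mentioned above, and a real program would choose its own recovery policy.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of buf to path.
 * Returns 0 on success, -1 on failure with errno set. */
int save_checked(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            int saved = errno;
            close(fd);          /* already failing; report the write error */
            errno = saved;
            return -1;
        }
        done += (size_t)n;
    }

    /* On a filesystem that defers the real store until close(), this is
     * the first place a quota error (EDQUOT) can surface. close() is not
     * retried here: POSIX leaves the descriptor's state unspecified after
     * some failures, such as EINTR. */
    if (close(fd) < 0)
        return -1;
    return 0;
}

/* Hypothetical caller: if recovery is impossible, at least say so. */
int save_or_report(const char *path, const char *buf, size_t len)
{
    if (save_checked(path, buf, len) == 0)
        return 0;
    fprintf(stderr, "%s: not saved: %s\n", path, strerror(errno));
    return -1;
}
```

A caller that can recover (for example, by keeping its buffer and offering to save somewhere else) would do so where save_or_report() only prints a message.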