Path: utzoo!attcan!uunet!cs.utexas.edu!sdd.hp.com!ucsd!ucbvax!bloom-beacon!athena.mit.edu!jik
From: jik@athena.mit.edu (Jonathan I. Kamens)
Newsgroups: comp.unix.internals
Subject: dealing with close() errors (was Re: On the silliness of close() giving EDQUOT)
Message-ID: <1990Oct29.051212.13740@athena.mit.edu>
Date: 29 Oct 90 05:12:12 GMT
References: <15480@hydra.gatech.EDU> <1990Oct26.050448.26816@fts1.uucp>
Sender: daemon@athena.mit.edu (Mr Background)
Reply-To: jik@athena.mit.edu (Jonathan I. Kamens)
Organization: Massachusetts Institute of Technology
Lines: 67

In article , thurlow@convex.com (Robert Thurlow) writes:
|> Even here, a workaround might be to have the
|> process retry the close so the kernel will retry the NFS writes, after
|> telling the user he is over quota so that he can try to delete some
|> files on the server.  If your process exited, _close() could just go
|> ahead and burn the blocks out of the cache.

  If a user process tries to access a file/directory in an AFS volume
that is currently being operated upon (e.g. moved to another fileserver,
backed up, released to read-only from read-write, etc.) by the AFS
servers, the process hangs in the call that is doing the accessing, and
the kernel does a uprintf() telling the user something like, "afs:
Waiting for busy volume 536870973 in cell athena.mit.edu" (that message
is taken verbatim from when this happened to me this evening during the
nightly backup of my home directory).  The kernel then delays for a
noticeable but relatively small amount of time (probably on the order of
ten real-time seconds, although I can't say what the exact interval is)
and tries to do the access again; if it fails again, the same message is
printed.  This loops until the access succeeds.

  It might be worthwhile to consider a similar approach to dealing with
EDQUOT errors, both on write() and on close().
Although I'm not convinced I'd want the kernel to keep trying forever
(heck, I'm not even sure it keeps trying forever in the AFS case -- it
may eventually decide that something is screwed up on the server and
return an error to the user process, which is almost certainly the right
thing to do), I think it would be reasonable for the kernel to uprintf()
a message about quotas and try to write a few more times, after suitable
delays.  This would give the user a chance to rectify the problem before
data lossage occurs.

  Another possibility is to add a new system call, something like
try_close().  It takes a file descriptor, just like close(), but only
actually completes the close() if it is possible to do so without errors
(although it should treat EBADF and EINTR the same way close() does,
since there is nothing the programmer can do about them in any case).
So, if a programmer is concerned about data integrity, he can do a
try_close() before he does a close(), and if try_close() returns EDQUOT
or some such thing, the program can print a warning and wait for advice
from the user before continuing.

  We can generalize that and say that there should be a flush() system
call that takes a file descriptor and verifies that all output to it has
been performed and was successful.  I believe that the hypothetical
effects of such a system call can be simulated both on NFS and AFS files
by doing lseek(fd, (off_t) 0, L_INCR) (substitute SEEK_CUR for L_INCR on
a POSIX system, and/or 1 for L_INCR on a SysV system).

  A program which is paranoid about being sure that data gets written to
disk can therefore define a macro vwrite that does something like so:

	static int _vwrite_tmp;

	#define vwrite(fd,buf,nbytes) \
		((_vwrite_tmp = write((fd),(buf),(nbytes))) >= 0 \
			? (flush(fd) >= 0 \
				? _vwrite_tmp \
				: -1) \
			: -1)

I'm not sure whether or not I need more parentheses in there to force
the grouping to the way I want, but you get the idea.
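
  For anyone who would rather have a function than a macro, here is a
minimal sketch of the same idea.  It is an assumption-laden illustration,
not the flush() proposed above: it substitutes fsync(), a real call, for
the hypothetical flush(), and whether fsync() reports EDQUOT promptly on
a given NFS or AFS client depends on the implementation.  The short-write
loop and the EINTR retry are additions of the sketch, not part of the
macro.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch only: write all nbytes to fd, then ask the system to push them
 * out with fsync(), so that a deferred error (EDQUOT, ENOSPC, ...) has a
 * chance to be reported here instead of at close().
 * Returns nbytes on success, -1 with errno set on failure.
 */
static ssize_t vwrite(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;
    size_t left = nbytes;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; try again */
            return -1;          /* real error; errno is set by write() */
        }
        p += n;                 /* short write: advance and keep going */
        left -= (size_t)n;
    }

    /* fsync() stands in for the hypothetical flush() */
    if (fsync(fd) < 0)
        return -1;

    return (ssize_t)nbytes;
}
```

A caller that gets -1 back can inspect errno, warn the user, and decide
whether to retry before ever calling close().  The cost is one
synchronous flush per call, so this only makes sense for programs that
really are paranoid about their data.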

  (Credit where credit is due: The suggestion that started me thinking
about try_close() comes from John Carr here at Athena, but any problems
with the suggestions I've posted are of course completely my fault :-)

-- 
Jonathan Kamens                              USnail:
MIT Project Athena                           11 Ashford Terrace
jik@Athena.MIT.EDU                           Allston, MA  02134
Office: 617-253-8085                         Home: 617-782-0710