Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!hao!boulder!sunybcs!rutgers!gatech!mcdchg!usenet
From: neilb@elecvax.eecs.unsw.oz (Neil F. Brown)
Newsgroups: comp.unix
Subject: Some thoughts on filenames - "" in particular
Message-ID: <2423@mcdchg.UUCP>
Date: Fri, 13-Nov-87 17:04:27 EST
Article-I.D.: mcdchg.2423
Posted: Fri Nov 13 17:04:27 1987
Date-Received: Sun, 15-Nov-87 11:04:57 EST
Sender: usenet@mcdchg.UUCP
Organization: EE and CS, Uni of NSW, Sydney, Australia
Lines: 106
Approved: usenet@mcdchg.UUCP
Summary: What we need is a definition.

[Swiped from comp.unix.wizards -mod]

I once noticed the following:
On a level 7 system (which we still use)

    chmod -x .      # silly, but possible and I have seen it happen
                    # as in chmod -x * .*
    ls -l
                    # thinks: "damn"
    chmod +x .
                    # damn, can't remember where I was
    pwd
    chmod +x ""
                    # this works as "" IS the current directory,
                    # no directory search needed

This is what first convinced me that "", not ".", was really the
current directory.

On BSD4.2 it goes much the same way until you get to

    chmod +x ""

This don't work on BSD!! I was shocked. Is NOTHING sacred?
Now I HAVE to remember where I am. Such is "progress".

Also, in V7, EVERY null terminated string was a potentially valid
path name. 4.?BSD disallowed setting the 8th bit (though this may change
when the realities of international character sets sink home).
SVr2(?) disallows "" - bad to worse.
Though I'm not sure, did they? The error is ENOENT. Does this mean I can

    ln file ""

But no, as this would mean thing//thong is different to thing/thong.
So now, not every string is potentially valid, so the universe is
less general. Such is progress.

And what about the file name "foo/"?
One of the original documents ("The Unix File System"?) states that
trailing slashes are stripped, so this is equivalent to "foo", or was.
On a BSD system, try

    echo */

It probably only lists the directories. (It depends on the shell).
On 4.2BSD at least, "foo/" is only accessible if foo is a directory.
Personally, I prefer these semantics. But there is still a funny. Try

    rmdir foo/

You get an error message like

    rmdir: foo/: Is a directory.

Well, I know it's a directory, that's why I used rmdir!!

    rmdir foo

of course works.

After considering all of this, I came to the conclusion that the
best semi-formal semantics for Unix file names was

SLASH = '/'+      # a non empty string of slashes
NAME  = [^/]+     # a non empty string of non-slash (non \0) chars

A "file" is essentially a byte-stream (+ seek+ioctl+fcntl+...)
A "directory" is a mapping from NAMEs to "file"s
  i.e. a "directory" is a function from the space of NAMEs
  to the space of "file"s

path     = filename         # in which case path refers to a file/device/
                            # socket/stream/etc. a "file"
         | dirname          # path refers to a "directory"
         .
dirname  = SLASH            # path refers to the root directory (for the process)
         |                  # path refers to current (working) directory
         | filename SLASH   # The file named is to be interpreted in some
                            # system dependent fashion as defining a "directory"
                            # function. The path refers to that "directory".
         .
filename = dirname NAME     # the function dirname is applied to the NAME
                            # to produce a "file"
         .

Note that the empty string is never considered to be a directory entry;
all directory entries are non-empty strings.

[ At this point we could discuss what equivalence relation we will impose
  on name components - only 14 chars are significant, case is not
  significant... But that's off the track. ]

In this system, "foo" is technically different from "foo/".
You could reasonably make a read(2) on "foo" return the system dependent
representation of the directory, while a read on "foo/" returns
the NAMEs (null terminated) which the directory will successfully map
(i.e. put readdir into the kernel).
Of course, this last thought would break much, so it probably ain't
worth the effort.

Now I'm not saying this is the way it IS, anywhere.
I just think that it's a particularly clean way to define the semantics
of path names. If anyone has a different, complete, semi-formal
definition, I would love to see it.

Happy arguing.

NeilBrown
(Organisation, address, etc in the header where they belong)
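
For anyone who wants to poke at the path grammar above, here is a rough
sketch of it in Python. It is only an illustration of the proposed
definition, not what any kernel does. The function name classify and the
labels "root", "cwd", "filename" and "dirname" are made up for the example.
It never touches the file system; it only says how the grammar would read
a string.

```python
import re

# SLASH = '/'+   and   NAME = [^/]+   from the grammar above.
TOKEN = re.compile(r"/+|[^/]+")


def classify(path):
    """Read a path according to the proposed grammar.

    Returns (kind, start, names) where
      kind  is "filename" or "dirname" (the label is hypothetical),
      start is "root" if the path begins with SLASH, else "cwd",
      names is the list of NAMEs applied in turn.
    """
    if "\0" in path:
        raise ValueError("NUL terminates the string, it cannot be in a NAME")

    tokens = TOKEN.findall(path)

    # dirname = SLASH | <empty> : the starting directory.
    start = "root" if path.startswith("/") else "cwd"

    # Each NAME token is one application of a directory to a NAME.
    names = [t for t in tokens if not t.startswith("/")]

    # A path ending in NAME is a filename; ending in SLASH
    # (or being empty) makes it a dirname.
    ends_in_name = bool(tokens) and not tokens[-1].startswith("/")
    kind = "filename" if ends_in_name else "dirname"
    return kind, start, names


if __name__ == "__main__":
    for p in ["", "/", "foo", "foo/", "thing//thong", "thing/thong"]:
        print(repr(p), classify(p))
```

Under this reading every string without a NUL is a valid path, "" is the
current directory, "foo" and "foo/" differ only in kind, and thing//thong
reads the same as thing/thong.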