Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!hc!beta!cmcl2!brl-adm!adm!rbj@icst-cmr.arpa
From: rbj@icst-cmr.arpa (Root Boy Jim)
Newsgroups: comp.unix.wizards
Subject: symbolic links are a botch
Message-ID: <7724@brl-adm.ARPA>
Date: Sat, 6-Jun-87 02:08:26 EDT
Article-I.D.: brl-adm.7724
Posted: Sat Jun 6 02:08:26 1987
Date-Received: Wed, 10-Jun-87 02:38:46 EDT
Sender: news@brl-adm.ARPA
Lines: 112

> I think that the reason for this discrepancy is that the implementation of symbolic links on BSD Unix is a botch.

I think the key word here is "the". The kernel implements symbolic links as you describe below; the csh maintains $cwd symbolically. If you do `cd /foo/bar' where /foo/bar -> /whiz/bang followed by `cd ..', $cwd gives /foo while pwd gives /whiz. Thus csh is interpreting `..' to mean `back' while the kernel takes it to mean `up'.

> In order to use a system with symbolic links without being surprised, you have to be aware of all symbolic links to directories. The reason is that the kernel does not keep track of the device-inodes as you descend the file tree, so it does not know how to walk back up.

As you mention, the kernel has no sense of history (is it thus condemned to repeat it? :-), whereas the csh does. This is mentioned in the bugs section.

> For example, if I add a directory of include files named /usr/include/local and one of my include files includes "../stat.h", it will not work on BSD but will work on Sys V. Can you imagine what Unix would have been like if .. were not handled specially at mount points? You would have to know where each of the mount points was. Two systems could be compatible only if they had identical mount points.

It would be like Version 6, right? I think `cd ..' didn't work at the `root' of a mounted file system.

> I recently added symbolic links to my version of the System V Release 3 kernel. I made sure that .. was always handled like an operator that moves you up one level.
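
For readers who want the `back' reading of `..' spelled out, here is a minimal sketch in C. It is not csh source and not the System V change described above; the function name logical_dotdot is hypothetical, and it assumes an absolute path with no trailing slash. It only edits the remembered path text and never consults the file system, which is the sense in which a shell that tracks $cwd as a string can disagree with the kernel.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (an illustration, not csh code): apply the `back'
 * meaning of `cd ..' to a remembered path string.
 * Assumes cwd is absolute and has no trailing slash; edits it in place.
 */
char *logical_dotdot(char *cwd)
{
    char *slash;

    assert(cwd != NULL && cwd[0] == '/');

    slash = strrchr(cwd, '/');   /* cannot be NULL: cwd starts with '/' */
    if (slash == cwd)
        slash[1] = '\0';         /* parent of "/foo" (or of "/") is "/" */
    else
        *slash = '\0';           /* "/foo/bar" becomes "/foo" */
    return cwd;
}
```

Given "/foo/bar" this yields "/foo" whether or not /foo/bar is a symbolic link. A chdir("..") issued from inside the directory follows the `..' entry of the physical directory instead, so in the /foo/bar -> /whiz/bang case the two answers differ.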
> The changes were not complex and I have not run into any problems at all. The user again sees the file system as a tree structure which preserves the original simplicity of the Unix model.

There is no way you can simulate a tree unless you ignore symbolic links, because it no longer *is* a tree. It is a directed graph, possibly with loops. In fact, symbolic links are no worse than when root makes links between directories. The possibility exists for loops, and the notion of `up' is relative to the original directory, because it was made with mkdir, which created the `..' entry, whereas the replicant was made exactly like a regular link, merely adding an entry in *its* parent directory and bumping the target inode link count. It remains childless, in the sense that it is no one's parent.

> By the way, I did modify find so that you can specify a virtual walk or a physical walk. A virtual walk follows symbolic links. The virtual walk keeps track of device-inodes so that I never descend through the same physical directory more than once. Therefore, a symbolic link to / poses no problem.

What you do with find has nothing to do with what the kernel thinks. Find's job is to make some sense out of `everything in this subtree', while the kernel only cares how to get from here to there, one level at a time. Ignoring symbolic links turns the graph back into a tree; consider what tar does with symbolic links. It merely records their presence; it does not descend into any subdirectories or recopy any files. Alternatively, considering them as first class citizens can flatten a graph into a tree (with various files or subdirectories replicated as long as there are no loops). There are useful interpretations for both.

> (Now some guru might say that . and .. are just names that have i-nodes associated with them just like any other name and therefore, the i-node should be followed. If this were true, then as root I should be able to edit my directory and change the inode of .
> to change its meaning. What do you think really happens?)

Whoa! It is a long way from playing with the meaning of . and .. to editing directories as if they were files. Consider the `link' and `unlink' programs, main(c,v)char**v;{link(v[1],v[2]);} and main(c,v)char**v;{unlink(v[1]);} respectively. Now try `unlink .' followed by `link . ' followed by `cd .' followed by `/bin/pwd'. Don't be surprised if you get something random. Similarly for `..'. The kernel doesn't give a hoot, but you and I and fsck do.

> What would be the best semantics for ..?

Exactly what they are now as far as the kernel is concerned.

> Do you think that BSD has done it right or have I done it right?

At the next conceptual level, i.e. that of the shell, i.e. what happens when you type `cd ..', I can see several alternatives. Possibly the best is to not treat it specially. Alternatively, one could consider it to be "rindex(cwd,'/')[0]=0; chdir(cwd)".

> What code would break if .. were always treated as an operator?

What do you mean by operator? `UP', or `BACK'? Or something else? Consider the case of loops: after `descending' into the same directories a few times and realizing what happened, do you want to retrace your steps in reverse, or go straight up as fast as possible? In the former case, `..' has a time dependent meaning.

> Disclaimer: I am not affiliated with the System V product; I am involved in research only, and these opinions are my own. They do not imply any direction for future releases of System V.

Likewise :-) Perhaps I can sum up by saying that symbolic links are fraught with paradoxes that even their creators have not addressed. They are much like trap doors. However, they are useful enough that their anomalies are suffered. I hope I can live up to that standard.

> David Korn
> {ihnp4,decvax}ulysses!dgk

	(Root Boy) Jim Cottrell
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688