Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!munnari!kre
From: kre@munnari.oz (Robert Elz)
Newsgroups: comp.unix.wizards
Subject: Re: Symbolic Links
Message-ID: <1811@munnari.oz>
Date: Thu, 3-Sep-87 12:26:44 EDT
Article-I.D.: munnari.1811
Posted: Thu Sep  3 12:26:44 1987
Date-Received: Sat, 5-Sep-87 10:42:41 EDT
References: <8731@brl-adm.ARPA> <2789@ulysses.homer.nj.att.com> <1781@munnari.oz> <2912@ulysses.homer.nj.att.com>
Organization: Comp Sci, Melbourne Uni, Australia
Lines: 175

It's definitely time to stop this discussion; it's getting nowhere, and
when alice!mvs starts posting its gibberish it's best to shut up before
some clown decides to comment on that...

In article <2912@ulysses.homer.nj.att.com>, ekrell@hector..UUCP (Eduardo Krell)
writes:
> 1. Then it's my fault because I didn't follow your instructions. You told
>    me I could get to /usr/bin from /usr/include/sys, not /sys/h

But they're the same place.  Isn't that the whole point of a "link"?
Whichever name you refer to you get the same, identical, object, with
the same properties as all other names that refer to the same object.
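
To make that property concrete, here is a minimal sketch, assuming a
POSIX system with stat(2) and symlink(2).  It rebuilds the /sys/h and
/usr/include/sys layout under a scratch directory and compares device
and inode numbers.  The /tmp path and the names same_object and
demo_same_object are illustrative only, not taken from any
implementation discussed in this thread.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if both paths name the same object (same
 * device and inode), 0 if not, -1 on error.  stat() follows symlinks. */
int same_object(const char *p1, const char *p2)
{
    struct stat s1, s2;

    assert(p1 != NULL && p2 != NULL);
    if (stat(p1, &s1) != 0 || stat(p2, &s2) != 0)
        return -1;
    return s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}

/* In a fresh scratch directory, build sys/h (a real directory) and
 * usr/include/sys -> ../../sys/h (a symlink), then check that:
 *   - both names refer to the same object,
 *   - ".." reached through either name is the same directory,
 *   - that directory is NOT usr/include.
 * Returns 1 if all three hold, 0 if not, -1 on setup failure. */
int demo_same_object(void)
{
    char dir[] = "/tmp/sameobjXXXXXX";   /* assumed scratch location */

    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    if (mkdir("sys", 0755) != 0 || mkdir("sys/h", 0755) != 0 ||
        mkdir("usr", 0755) != 0 || mkdir("usr/include", 0755) != 0)
        return -1;
    if (symlink("../../sys/h", "usr/include/sys") != 0)
        return -1;
    return same_object("sys/h", "usr/include/sys") == 1 &&
           same_object("sys/h/..", "usr/include/sys/..") == 1 &&
           same_object("usr/include/sys/..", "usr/include") == 0;
}
```

Calling demo_same_object() from a small main() and checking for 1 is
enough to see whether a given system treats the two names as one object.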

If you're trying to make symbolic links hide themselves in the filesystem
and almost appear not to be there at all, I would have thought that this
would have been one of the fundamental properties you would have been
determined to preserve.

Certainly it is one that I demand.  I am quite willing, even eager,
to change "symbolic link" to "pointer file" or some such name that
doesn't evoke the same emotional response to how it should behave,
but I still want object X to be object X, regardless of its past
history.  Can you really not imagine the confusion such a change
would cause?

> 2. When you created the symbolic link, you should have used "/usr/bin"
>   instead of "../../bin". Then it would work.

No, never.  Building in full path names to *anything* is a very
poor idea.  Of course, in this particular example it would probably
be safe, as /usr/bin rarely moves ... but in general a relative
pathname should be used in anything that's to be saved permanently
(as opposed to things like command args, etc.), to allow the whole
object (tree) to be moved someplace else with impunity.
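
A minimal sketch of why the relative form survives a move, assuming a
POSIX system with symlink(2) and rename(2).  The directory names, the
/tmp scratch path, and the helpers resolves and demo_move_tree are
made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if path resolves (following symlinks), else 0. */
int resolves(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0;
}

/* In a fresh scratch directory, build tree/data plus two links in
 * tree/bin:
 *     rel -> ../data               (relative pathname)
 *     abs -> <scratch>/tree/data   (full pathname)
 * then rename tree to moved and see which links still resolve.
 * Returns bit 0 set if rel survives, bit 1 set if abs survives,
 * or -1 on setup failure. */
int demo_move_tree(void)
{
    char dir[] = "/tmp/mvtreeXXXXXX";    /* assumed scratch location */
    char target[256];

    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    if (mkdir("tree", 0755) != 0 || mkdir("tree/data", 0755) != 0 ||
        mkdir("tree/bin", 0755) != 0)
        return -1;
    snprintf(target, sizeof target, "%s/tree/data", dir);
    if (symlink("../data", "tree/bin/rel") != 0 ||
        symlink(target, "tree/bin/abs") != 0)
        return -1;
    if (!resolves("tree/bin/rel") || !resolves("tree/bin/abs"))
        return -1;                       /* both must work before the move */
    if (rename("tree", "moved") != 0)
        return -1;
    return resolves("moved/bin/rel") | (resolves("moved/bin/abs") << 1);
}
```

A result of 1 means only the relative link still points at the data
after the tree has been moved; the full pathname is left dangling.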

> I think symbolic links are useful and desirable.

Finally, we agree on something...

> The point is we have an opportunity to do it right this time.
> We shouldn't miss that opportunity.

If there's anything that's clear from this discussion, it's
that there isn't a consensus of opinion on what is "right".

Given that, do you really want to be immortalized as the AT&T
person who forced your semantics of symlinks into a public
release, only to have it changed in the next release because
of the outcry?

Would you like to be whoever it was that suggested that Sys V.0's
compiler (or linker, or whatever) should require "extern" on all but
one instance of an extern variable, even though just about everyone
would agree that that change was "right"?

Maybe I should be more explicit on why I don't think that you can
possibly have implemented what you claim to have implemented, and
implemented it correctly.  Ed Gould provided a counter example
that you seemed to just shrug off without really understanding it.

Let me expand on that, and do it in a way where the usefulness of the
technique is a little more apparent.

First, let us agree that one of the properties of unix systems that
we do want to preserve is that there isn't any system-imposed limit
on how long a process can execute normally: if it can execute its
major code loop for a few thousand iterations, it should be able to
just keep on doing that forever.

There are all kinds of other limitations on processes (number of open
files, amount of memory, ...) but none that affect continuous execution.

Now, let's assume that I have an application with a HUGE database to
support.  Let's make it so huge that it isn't likely to fit in any
part of the file system tree that my customers are likely to have
available.  I could require them to rearrange their mounted filesystems
to provide lots of file space under one directory, but that's not
a very intelligent business decision for me to make, especially not
after Sys V gets symlinks, and I know I can rely on those.

It happens that my application's database can be nicely divided into
a number (let's say just two for this example, but that doesn't matter)
of separate filesystem trees.

Of course, I'm just delivering a binary to my customer, so he can't
recompile it to build in the directory names, and he wouldn't want to
anyway.  I could require the relevant directory names be passed as
args to the various commands, but that's tedious, even assuming shell
scripts to do it all.  (Tedious to set up initially, and tedious to
maintain as things move around later).  Using ENV vars for this is
simply wrong.

So, what I decide to do is have each of the major file trees contain symlink
pointers to the other file trees; for this example let's call the symlinks
"a", "b", etc (just two will do).  In the "b" tree I have a symlink "a"
that points at the "a" data, and in the "a" tree a "b" symlink that points
at the "b" data.

All very simple.

Now the application looks something like this

	for (;;) {

		chdir("a");
		process_a_data();
		chdir("b");
		process_b_data();

	}

with no other chdir sys calls anywhere.

When my customer first installs my database, he puts the "a" data on
/disc1/a and the "b" data on /disc2/files/b, then he (or my installation
script) does something like

	chdir /disc1/a
	ln -s /disc2/files/b b
	chdir /disc2/files/b
	ln -s /disc1/a a

The application starts, and runs for years without stopping, no
problems at all.

But after those years, my customer decides that he can afford a big new
disc, and on this he's going to be able to put all the files into
one tree.  Let's call the new place /disc3.

First he moves all the data files, then

	chdir /disc3
	ln -s . a
	ln -s . b

and he starts the application running.  How long does your implementation
give it before things start mysteriously failing?

Nb: there are no paths here with lengths longer than 1024.  Nor are
there any paths that breach the "8 symlink" rule.  [Aside: if you can
find a clean way to remove that one, apart from just increasing the
number, I won't object at all .. it's not easy though].
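
For reference, a minimal sketch of what that rule looks like from a
program, assuming a POSIX system where exceeding the symlink limit is
reported as ELOOP.  The exact limit is system dependent, so the sketch
only uses a short chain (well under 8) and a link that points at
itself.  The /tmp path and the names make_chain, resolve_errno and
demo_link_limit are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: in the current directory, create
 * link0 -> link1 -> ... -> link(n-1) -> "."   Returns 0 on success. */
int make_chain(int n)
{
    char from[32], to[32];
    int i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        snprintf(from, sizeof from, "link%d", i);
        if (i + 1 < n)
            snprintf(to, sizeof to, "link%d", i + 1);
        else
            snprintf(to, sizeof to, ".");
        if (symlink(to, from) != 0)
            return -1;
    }
    return 0;
}

/* Returns 0 if path resolves, otherwise the errno set by stat(). */
int resolve_errno(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0 ? 0 : errno;
}

/* A 3-link chain should resolve; a link pointing at itself can never
 * resolve and should be rejected with ELOOP.
 * Returns 1 if both hold, 0 if not, -1 on setup failure. */
int demo_link_limit(void)
{
    char dir[] = "/tmp/symloopXXXXXX";   /* assumed scratch location */

    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    if (make_chain(3) != 0 || symlink("self", "self") != 0)
        return -1;
    return resolve_errno("link0") == 0 && resolve_errno("self") == ELOOP;
}
```

Neither case arises in the /disc3 example, where each chdir follows
exactly one symlink.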

Please actually try this code on your implementation (make the "process"
functions be empty macros to avoid wasting time).  If it fails, then
I contend that your implementation is not "right".  Every couple of
thousand times round the loop you could have the program fork and exec "pwd"
to provide some idea where in the tree it is, and where chdir("..")
will go.
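
One way to run that experiment is sketched below, assuming a POSIX
system.  It recreates the /disc3 layout in a scratch directory under
/tmp and, as a substitution for the fork and exec of "pwd", calls
getcwd(3) directly.  The names run_loop and demo_disc3, the scratch
path, and the reporting interval of 2000 are all choices made for this
sketch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Runs the chdir("a"); chdir("b"); loop with empty "process" steps,
 * starting from the current directory.  Every report_every iterations
 * it prints the working directory.  Returns the number of iterations
 * completed before the first failure (== iterations if none). */
long run_loop(long iterations, long report_every)
{
    char cwd[4096];
    long i;

    assert(iterations >= 0 && report_every > 0);
    for (i = 0; i < iterations; i++) {
        if (chdir("a") != 0 || chdir("b") != 0) {
            perror("chdir");
            break;
        }
        if ((i + 1) % report_every == 0) {
            if (getcwd(cwd, sizeof cwd) == NULL) {
                perror("getcwd");
                break;
            }
            printf("%ld: %s\n", i + 1, cwd);
        }
    }
    return i;
}

/* Builds a scratch directory containing a -> . and b -> . and runs
 * the loop in it.  Returns 1 if every iteration succeeded and the
 * working directory is unchanged at the end, 0 if not, -1 on setup
 * failure. */
int demo_disc3(long iterations)
{
    char dir[] = "/tmp/disc3XXXXXX";     /* stands in for /disc3 */
    char start[4096], end[4096];

    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    if (symlink(".", "a") != 0 || symlink(".", "b") != 0)
        return -1;
    if (getcwd(start, sizeof start) == NULL)
        return -1;
    if (run_loop(iterations, 2000) != iterations)
        return 0;
    if (getcwd(end, sizeof end) == NULL)
        return 0;
    return strcmp(start, end) == 0;
}
```

An implementation that records the working directory as a directory
object should return 1 for any iteration count; one that appends "a"
and "b" to a stored path would presumably fail once that path outgrows
its limit.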

If your implementation uses the "store the path" technique, then all
this will work, but you will have changed lots of other unix semantics,
other things just won't behave as they used to, and you've given no
indication that's how your implementation works.

Finally, others suggested generalized mount as a solution to this
problem.  I have no objection to that concept at all, however it
doesn't really solve the problem.

First, symlinks are user definable things, mount is generally
an administrator's tool.  As a user I want to be able to make
pointers to directories, and I don't want to lose that ability.

Second, unless the semantics of mount are changed more than I think
was intended when this is done, when you "mount" /sys/h on /usr/include/sys
you are effectively removing /sys/h from its old position, and putting
it under /usr/include.  Whether that means that references to /sys/h
now fail, I don't know.  If they do, then that is not the object at all.
If they don't, then "/sys/h/.." would be "/usr/include" which doesn't
seem to be the object either.

So, generalized mount is a good idea, and can certainly be useful,
but it in no way helps solve the symlink problems.

kre