Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!ut-sally!husc6!rutgers!sri-spam!mordor!lll-lcc!ptsfa!hoptoad!academ!uhnix1!nuchat!steve
From: steve@nuchat.UUCP (Steve Nuchia)
Newsgroups: comp.unix.wizards
Subject: Re: symbolic links poster child
Message-ID: <244@nuchat.UUCP>
Date: Sun, 28-Jun-87 10:20:21 EDT
Article-I.D.: nuchat.244
Posted: Sun Jun 28 10:20:21 1987
Date-Received: Thu, 2-Jul-87 04:45:49 EDT
References: <7955@brl-adm.ARPA> <2249@bunker.UUCP>
Organization: Public Access - Houston, Tx
Lines: 43
Summary: ELOOP, loop detection algorithm

In article <2249@bunker.UUCP>, zink@bunker.UUCP (David Zink) writes:
> >   In article <2211@bunker.UUCP>, zink@bunker.UUCP (David Zink) writes:
> >   > What's wrong about more than eight symbolic [links]?
...
> I am continually appalled by the number of hard-coded constants:
> # of file descriptors, # of signals, # of characters in a filename or path.
> like 8 into the kernal?
... 
> I guess that my major complaint is that ELOOP purports to mean LOOP but
> really means EIGHT.

For reference, if anyone is in a position to fix this, there is a very
good loop detection algorithm that takes 1.5 X the time and 2 X the space
of the basic linear search it protects.  That is, the protected algorithm
as a whole has that complexity.

The algorithm is to run two pointers down your list, the trailing pointer
moving one element for every two scanned by the leader.  The two pointers
should be compared after each step of the leader.  Proof that all loops
are detected is left as an excersize.  Hint: if I remember correctly, the
loop must be detected within 1/2 the loop length after the trailing pointer
enters the loop.

For application to symlinks this state machine could operate only on the
symlink decodes, although it would have to decode any hard links between
the symlinks as part of the basic cycle.  Thus the extra work of moving
the trailing pointer would never come into play in cases with no more than
one symlink in the (expanded) path.  The space consideration is trivial:
it doubles (essentially) the local variables of nami().  Time-wise the
algorithm is a bit of a loss, because of the potential for reading blocks
into the buffer cache more than once in a single nami() _without_loops_.

Note that it would not be correct to implement the algorithm on all pathname
components, else foo/../bar would get an ELOOP when ./foo and ./bar are
both legal.

I'm pretty sure I saw this algorithm in one of Dijkstra's papers, though
I doubt it is original with him: probably a crufty old trick from the dawn
of time...

	Steve Nuchia
	(713) 334 6720 voice  334 1204  2400N81  "trouble" for anon mail to me.
	{housun,{academ,soma}!uhnix1}!nuchat!steve