Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!zaphod.mps.ohio-state.edu!sunybcs!uhura.cc.rochester.edu!rochester!udel!mmdf
From: ccplumb%rose.waterloo.edu@cunyvm.cuny.edu
Newsgroups: comp.sys.amiga
Subject: Re: AmigaDos directory knowledge
Message-ID: <5484@nigel.udel.EDU>
Date: 6 Dec 89 16:22:08 GMT
Sender: mmdf@udel.EDU
Lines: 97

In article <88.filbo@gorn.santa-cruz.ca.us> filbo@gorn.santa-cruz.ca.us (Bela
 Lubkin) writes:
>[note that I have directed followups to the .tech group]

Yes, but they know all this stuff already...

>In article <832@lpami.wimsey.bc.ca> Larry Phillips writes:
>>The big advantage of the Amiga filing
>     ^^^^^^^^^^^^^
>>system is that a fully known filename is found VERY quickly. The reason for
>>this is that filenames are hashed into tables in the directory blocks, and can

>>be found within a (relatively) few seeks, simply by looking in the appropriate

>>header and offset within that header to find the pointer to the file. Hash
>>'collisions' are handled by chaining from one directory entry to the next, but

>>even then, there are relatively few reads to do.

>Oh, come ois and always has been bogus.  Even MS-DOS
>does far better than AmigaDOS in this area.  In the case of a known filename,
>MS-DOS reads the directory (which is effectively a single small binary file),
>searches it for the named file, and is done.  Since each MS-DOS directory
>entry is 32 bytes, and a sector is 512 bytes, 16 entries fit in a sector; it
>must read (n/16) sectors to find a file, where n=the position of the desired
>file within the directory.  In general that means it has to read 1-2 sectors
>for MOST files.  It would have to read 5 sectors to find ANY file in my Amiga
>C: directory (70 files); 2 for my System: directory (31 files).  Those sectors
>would almost always be physically contiguous.  Compare this to AmigaDOS, which
>must read an arbitrary number of sectors (often 1, but sometimes more), which
>are virtually guaranteed to be all over the disk.

AmigaDOS has to read an average of n/72+1 sectors, which actually beats n/16,
but the problem is that they're scattered.  MS-DOS can get scattered, too, but
because it never shrinks directories, the preallocation reduces the effect.

FFS stores hash chains in sorted order, significantly reducing the scattering.

> Then compare wildcard
>operations.  MS-DOS must still read the entire directory, which is still only
>a few (probably contiguous) sectors.  Amiread one sector for
>each file in the directory, probably scattered all over the disk.  I don't
>like this tradeoff in the least.

Even reading all the sectors isn't a problem, the problem is the OFS's bad
habit of reading the sectors in an arbitrary order with no thought
*whatsoever* to efficiency.

The FFS searches them in optimal order, producing order-of-magnitude speedups.

>This is a case of overblown data structures.  I'd take a monolithic directory
>any day.  In AmigaDOS' case, this would probably involve keeping the current
>directory blocks, but adding a true index for each directory -- a file that
>has nothing in it but filenames and the block numbers of their directory
>blocks.  This could be a file of 34-byte records (30 characters + a longword
>block number).  It would probably be best to put 15 of these in a sector and
>use the extra two bytes for something else, rather than starting a 16th entry
>there and carrying it over to the 2nd directory entry, to make it easier to
>reconstruct a damaged file system.

Everyone keeps suggesting this, but there is a problem: there is no way
to ask the file system for this information.  There are any number of
problems with the ExNext call that I won't go into, except to mention
that I'd like to introduce the fellow who thought it up to the Spanish
Inquisition, in its heyday, but one is that there is no way to ask for
less information than that provided by "list," namely file name, size,
protection bits, comment, type, and DiskKey.  The ExNext call returns
all this, and all the software that takes directories uses ExNext, so
to have a better file system produce speedups with existing software,
you have to be able to find all of this information quickly.

>The other argument I've heard for AmigaDOS' directory structure is "unlimited
>directory size", which is also bogus.  Nothing limits any directory under
>MS-DOS, >except< the root directory.

And there's no good reason for that limitation.  No, AmigaDOS doesn't have
any monopoly on large diretories.  But searching long paths, it does very
well on.  The n/72 beats MS-DOS's n/16 quite handily.  when you consider the
number of times you have to find a specific entry in a directory to find
"foo" in path x:a/b/c;y:bar/baz/quux;z:gurble/glorble/frobnitz/greeble/thlorp,
MS-DOS jumps all over the disk, too.  (If you use assign instead of putting
long strings into the path, AmigaDOS wins even bigger.  But that's an orthogonal

issue that could, in theory, be implemented on top of MS-DOS's file system.)

>[I speak of MS-DOS for two reasons: I know it well; and it uses SIMPLE
>solutions which happen to work pretty well.  Perhaps in 6 months I'd be
>arguing for some form of UNIX file system or something, but I don't know
>it well enough yet.  Actually, what I suggest sounds suspiciously like
>inodes, in a twisted kind of way.]

Yes, the problem is when you do an ls -l.  Or better yet, open each file
in a directory.  There are all kinds of access patterns that require thought.
It is true that getting a list of all the names is a very common one that
should have more thought put into it, though.
--
        -Colin