Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!ucbvax!jade!ucbopal!mwm
From: mwm@ucbopal.berkeley.edu (Mike (I'll be mellow when I'm dead) Meyer)
Newsgroups: net.micro.amiga
Subject: Re: AmigaDos...
Message-ID: <615@jade.BERKELEY.EDU>
Date: Sun, 27-Apr-86 05:44:23 EDT
Article-I.D.: jade.615
Posted: Sun Apr 27 05:44:23 1986
Date-Received: Fri, 2-May-86 07:34:12 EDT
References: <1076@h-sc1.UUCP>
Sender: usenet@jade.BERKELEY.EDU
Reply-To: mwm@ucbopal.UUCP (Mike (I'll be mellow when I'm dead) Meyer)
Distribution: net
Organization: Missionaria Phonibalonica
Lines: 217

In article <1076@h-sc1.UUCP> breuel@h-sc1.UUCP (thomas breuel) writes:
>A complaint, and two questions:
>
>I have philosophical problems with the AmigaDos file system.
>The file system appears to be optimised for finding a named entry
>quickly and for making directories small and cheap: the directories
>are hashed and do not contain the filenames. This means that for
>wildcard expansion and select-file-dialog-box'es, one block has
>to be read for each entry in a directory.

True, the AmigaDOS file structure loses on globs. However:

>I can do an 'echo *' or 'ls' and don't have to wait for minutes.

Minutes? How about a couple of seconds/file. That's closer to what I get.
For ~20 files, AmigaDOS runs about half the speed of Unix. But Unix is
running on an RA81 and AmigaDOS is running on a floppy.

If you want to do ls -l (AmigaDOS list), Unix gets extra block reads to get
inodes, but AmigaDOS has the same number of reads as with a dir.
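
A back-of-the-envelope sketch of the read counts being compared, in C. The
512-byte block and 16-byte directory entry are assumptions borrowed from V7
Unix (4BSD directories are laid out differently), the function names are made
up for illustration, and caching is ignored entirely.

```c
/* Rough block-read counts for listing a directory of n entries.
 * Assumptions (not from the post): 512-byte blocks, V7-style 16-byte
 * Unix directory entries, and no caching at all. */

#define BLOCK_SIZE	512L
#define UNIX_DIRENT	16L	/* 2-byte inode number + 14-byte name */

/* AmigaDOS: the directory block holds only hash-table pointers, so
 * each entry's header block must be read to learn its name. The
 * header also carries size, date and protection, so a long listing
 * costs the same as a short one. */
long amiga_list_reads(long n)
{
	return 1 + n;	/* directory block + one header per entry */
}

/* Unix short listing: the names live in the directory's data blocks. */
long unix_ls_reads(long n)
{
	return (n * UNIX_DIRENT + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

/* Unix long listing, worst case: one extra read per inode.
 * Inodes usually share blocks, so real counts are lower. */
long unix_ls_l_reads_worst(long n)
{
	return unix_ls_reads(n) + n;
}
```

Under these assumptions a 20-entry directory costs AmigaDOS 21 reads for
either kind of listing, while Unix needs 1 read for the names and up to 20
more for the inodes.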

>An unrelated, but (philosophically) similar problem is the use
>of linear, linked blocks for file allocation tables.

I'm not quite sure what you're referring to - the linked list of data
blocks, or the linked list of blocks with pointers to data blocks. The
second one is what's important, and it resembles Unix, except Unix has a
funky multi-level tree. For files with one or three allocation blocks,
AmigaDOS and Unix both need either one or three disk references. For two
allocation blocks, AmigaDOS needs two disk references, and Unix needs three.
For things beyond three allocation blocks, AmigaDOS needs one reference per
block, and Unix uses three or four (depending on Unix flavor) for all of
them. Of course, some Unixes go to linked lists of lists of allocation
blocks. However, with 1K blocks, any file small enough to fit on a floppy
will fit in two or three blocks.
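
To make the shape of that comparison concrete, here is a minimal sketch in C
of one way to count the metadata reads needed to find a given data block with
a cold cache. The pointer counts are parameters, the function names are
hypothetical, and the counting model is a simplification: it will not
reproduce the exact figures above, which depend on what you count as a
reference.

```c
/* Metadata reads needed to locate data block k of a file, cold cache.
 * The model and all parameter values are illustrative assumptions,
 * not AmigaDOS or Unix facts taken from the post. */

/* Linked allocation blocks: the file header holds the first per_block
 * pointers and each extension block holds the next per_block, so
 * reaching block k means walking the chain from the header. */
long linked_reads(long k, long per_block)
{
	return 1 + k / per_block;	/* header + extension blocks walked */
}

/* Unix-style tree: `direct` pointers in the inode, then single,
 * double and triple indirect blocks of per_block pointers each. */
long tree_reads(long k, long direct, long per_block)
{
	long span = per_block;	/* data blocks covered at this depth */
	long depth;

	if (k < direct)
		return 1;		/* inode only */
	k -= direct;
	for (depth = 1; depth <= 3; depth++) {
		if (k < span)
			return 1 + depth;	/* inode + indirect blocks */
		k -= span;
		span *= per_block;
	}
	return -1;			/* beyond triple indirect */
}
```

The linked scheme grows by one read per allocation block walked, while the
tree tops out at four reads however large the file gets. Any caching of the
allocation blocks changes both numbers.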

>It appears that allocation blocks are not buffered, so that for
>each seek, Dos has to scan through the whole linked list on disk.
>Needless to say that this is dreadfully slow.

Yeah - all two or three blocks. Some caching is done, though. At the end of
this article, you'll find a program that times calls to Seek (the
capitalization is correct - I used the AmigaDOS calls, not the stdio calls).
The timings look flat, and suggest that the allocation blocks are cached, to
some degree or another. The trackdisk buffer also shows up. It appears that
doing a seek takes ~30 ticks, at least the second time you look in that area
of the file (and I used a 700K+ file). But play with it yourself. Of
course, seeks *DO* seem somewhat slow. Someone oughta play with it some more
(but I'm not gonna).

Interestingly, the Read after the Seek always runs in zero ticks. This
indicates something wrong - the system shouldn't do any physical IO for a
seek until a Read or Write follows it. Even CP/M-80 got that right (at
least, the BIOSes I wrote did).

>The above two problems are compounded by the apparent lack of any
>attempts to allocate, say, all allocation blocks on the same track
>so that scanning through the allocation list may require many seeks.

Actually, you want something like the 4BSD fast file system - trying to put
blocks at the correct interleave on the same track/cylinder. Since the
allocation blocks aren't used very often in normal use (see 'A Trace-Driven
Analysis of the Unix 4.2 BSD File System' in the 10th SOSP), doing that with
allocation blocks isn't suited to a multi-tasking system. The apparent
caching of the allocation blocks when they are read in is probably optimal.

>Now, even if the above statements are not entirely accurate

Well, it looks to me like the one about seeks is flat wrong. The comment
about doing globbing is true; I still can't figure out why they used
globbing so heavily in the WorkBench.

>the fact
>remains that the AmigaDos file structure is, for practical purposes,
>too inefficient.

I hadn't noticed. But I use deep, narrow trees, so directory searching isn't
murderous for me (the Browser has problems when I try to look at other
people's disks, with fat trees).

>It discourages seeks and discourages directory scans.
>Instead it gives some (doubtful) savings due to small directory sizes
>and simplicity of algorithms for seeks (read lazy programmers ?!).

Sorry, but the AmigaDOS file system is probably less space efficient than
Unix. You add one block for the file header, then put the name and link
information in each block in the file. The savings in the directory are
negligible compared to this.

>I'm not writing this purely because I disagree with the file system
>structure in some abstract manner, but because for several applications,
>this file system is almost unusable (e.g. picking characters out of
>a 700k Kanji font file).

Give me a file system, I'll give you an application for which it's almost
unusable. Doing high-quality DBM kills most systems that grew out of the
Multics OS family.

>It should also be noted that a simple disk cache at the disk driver
>level is not necessarily going to improve matters. What is really needed
>is a way to instruct Dos to keep the allocation tables for a specific
>file in memory.

No, you need a GOOD disk caching system - comparable to 4BSD, and better
than the System V cache. The way AmigaDOS handles (or mishandles) memory
management makes this nontrivial. I hope to take a crack at it sometime this
year.

>Ultimately, I find that the current file system should be replaced
>with something more UN*X like (with name/inode pairs in directories,
>inodes, and links). I'm not saying this because I find the UN*X
>file system pleasing to the eye, but because it works, and because

No, NO, NO! Going to an ilist with name/inode pairs DOUBLES the number of
disk accesses to open a file by name! It also makes the file system less
robust - a hit on the ilist is fatal to Unix, but the AmigaDOS file system
can theoretically be recovered from any single-block hit. The AmigaDOS file
system is theoretically faster than the Unix file system for most
operations, the major exception being searching a directory. It trades space
and links for robustness. On a home machine, this is a win - *IF* there's a
tool for rebuilding broken disks. I understand that MetaComCo has one; why
don't we have it?

>So, one question, you have seen already: I need a workaround to access
>random parts of a large file quickly. I would greatly prefer not to split
>the file up... (and that wouldn't necessarily help anyhow).

I suggest you use the same solution that people who need fast disk access
on Unix used. Use the raw disk. There are examples of using the trackdisk
driver floating around on the net; using that makes doing your seeks easy:
divide the byte number you want by the number of bytes per track, and go
read the track it's on. There's no way to make it go faster, except buying
faster disks.
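
The arithmetic is short enough to sketch in C. This assumes your data starts
at byte 0 of the raw disk and runs contiguously; the struct and function
names are hypothetical, and the 5632 bytes per track used in the note below
(11 sectors of 512 bytes) is an assumed figure for an Amiga floppy, not
something taken from the trackdisk examples mentioned above.

```c
/* Map a byte number on a raw disk to the track holding it.
 * bytes_per_track is left as a parameter; the caller supplies
 * whatever the drive actually uses. */

struct track_pos {
	long	track ;		/* which track to read */
	long	offset ;	/* byte position inside that track's buffer */
	} ;

struct track_pos
locate(long byte_number, long bytes_per_track)
{
	struct track_pos	pos ;

	pos . track  = byte_number / bytes_per_track ;
	pos . offset = byte_number % bytes_per_track ;
	return pos ;
}
```

With 5632 bytes per track, byte 700000 lands on track 124 at offset 1632.
Read that one track into a buffer and index into it.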

	<mike
/*
 * seektest - a quick test to see if the time to seek to a byte is
 *	flat, or if it gets longer as you go farther into the file.
 */

#include <exec/types.h>
#include <stdio.h>
#include <libraries/dos.h>

static long	file ;
void		do_seek(int), do_read(int), done(int, char *) ;

void
main(argc, argv) char **argv; {
	struct DateStamp	How_Long, Time_Function() ;
	int			whereto ;

	if (argc < 2) done(20, "usage: seektest <file>") ;
	if ((file = Open(argv[1], MODE_OLDFILE)) == 0)
		done(20, "can't open input file") ;

	for (;;) {
		printf("? ") ;
		/* no trailing "\n" in the format: it would block for more input */
		if (scanf("%d", &whereto) != 1 || whereto == 0)
			done(0, "exit") ;
		Seek(file, 0, OFFSET_BEGINING) ;
		How_Long = Time_Function(do_seek, whereto) ;
		printf("Seek took %5ld ticks\n", (long) How_Long . ds_Tick) ;
		How_Long = Time_Function(do_read, 0) ;
		printf("Read took %5ld ticks\n", (long) How_Long . ds_Tick) ;
		}
	}
/*
 * Time_Function - just return a DateStamp saying how long it took
 *	to call the given function with the given argument
 */
struct DateStamp
Time_Function(function, argument) void (*function)(int); int argument; {
	struct DateStamp	start, end, diff ;

	DateStamp(&start) ;
	(*function)(argument) ;
	DateStamp(&end) ;

	diff . ds_Tick = end . ds_Tick - start . ds_Tick ;
	if (diff . ds_Tick < 0) {
		diff . ds_Tick += (60 * TICKS_PER_SECOND) ;
		--(end . ds_Minute) ;
		}
	diff . ds_Minute = end . ds_Minute - start . ds_Minute ;
	if (diff . ds_Minute < 0) {
		diff . ds_Minute += (24 * 60) ;
		--(end . ds_Days) ;
		}
	diff . ds_Days = end . ds_Days - start . ds_Days ;
	return diff ;
	}
/*
 * do_seek - seek to where in (global) file.
 */
void
do_seek(where) int where; {
	if (Seek(file, where, OFFSET_BEGINING) < 0)
		done(20, "Seek failed") ;
	}
/*
 * do_read - get one byte from the (global) file, and throw it away.
 */
void
do_read(unused) int unused; {
	char	junk ;
	if (Read(file, &junk, 1) < 0) done(20, "Read failed") ;
	}
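/*
 * done - close the file if it is open, say why, and exit.
 *	AmigaDOS does not close files for you when the program exits.
 */
void
done(how, why) int how; char *why; {

	if (file) Close(file) ;
	fprintf(stderr, "seektest: %s\n", why) ;
	exit(how) ;
	}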