Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxn!ihnp4!ucbvax!jade!ucbopal!mwm
From: mwm@ucbopal.berkeley.edu (Mike (I'll be mellow when I'm dead) Meyer)
Newsgroups: net.micro.amiga
Subject: Re: AmigaDos...
Message-ID: <615@jade.BERKELEY.EDU>
Date: Sun, 27-Apr-86 05:44:23 EDT
Article-I.D.: jade.615
Posted: Sun Apr 27 05:44:23 1986
Date-Received: Fri, 2-May-86 07:34:12 EDT
References: <1076@h-sc1.UUCP>
Sender: usenet@jade.BERKELEY.EDU
Reply-To: mwm@ucbopal.UUCP (Mike (I'll be mellow when I'm dead) Meyer)
Distribution: net
Organization: Missionaria Phonibalonica
Lines: 217

In article <1076@h-sc1.UUCP> breuel@h-sc1.UUCP (thomas breuel) writes:
>A complaint, and two questions:
>
>I have philosophical problems with the AmigaDos file system.
>The file system appears to be optimised for finding a named entry
>quickly and for making directories small and cheap: the directories
>are hashed and do not contain the filenames. This means that for
>wildcard expansion and select-file-dialog-box'es, one block has
>to be read for each entry in a directory.

True, the AmigaDOS file structure loses on globs. However:

>I can do an 'echo *' or 'ls' and don't have to wait for minutes.

Minutes? How about a couple of seconds/file. That's closer to what I
get. For ~20 files, AmigaDOS runs about half the speed of Unix. But
Unix is running on an RA81 and AmigaDOS is running on a floppy.

If you want to do ls -l (AmigaDOS list), Unix gets extra block reads
to get inodes, but AmigaDOS has the same number of reads as with a dir.

>An unrelated, but (philosophically) similar problem is the use
>of linear, linked blocks for file allocation tables.

I'm not quite sure what you're referring to - the linked list of data
blocks, or the linked list of blocks with pointers to data blocks. The
second one is what's important, and it resembles Unix, except Unix has
a funky multi-level tree.
For files with one or three allocation blocks, AmigaDOS and Unix both
need either one or three disk references. For two allocation blocks,
AmigaDOS needs two disk references, and Unix needs three. For things
beyond three allocation blocks, AmigaDOS needs one reference per block,
and Unix uses three or four (depending on Unix flavor) for all of them.
Of course, some Unixes go to linked lists of lists of allocation blocks.

However, with 1K blocks, any file small enough to fit on a floppy will
fit in two or three allocation blocks.

>It appears that allocation blocks are not buffered, so that for
>each seek, Dos has to scan through the whole linked list on disk.
>Needless to say that this is dreadfully slow.

Yeah - all two or three blocks. Some caching is done, though. At the
end of this article, you'll find a program that times calls to Seek
(the capitalization is correct - I used the AmigaDOS calls, not the
stdio calls). The timings look flat, and suggest that the allocation
blocks are cached, to some degree or another. The trackdisk buffer also
shows up. It appears that doing a seek takes ~30 ticks, at least the
second time you look in that area of the file (and I used a 700K+
file). But play with it yourself.

Of course, seeks *DO* seem somewhat slow. Someone oughta play with it
some more (but I'm not gonna).

Interestingly, the Read after the Seek always runs in zero ticks. This
indicates something wrong - the system shouldn't do any physical IO for
a seek until a Read or Write follows it. Even CP/M-80 got that right
(at least, the BIOSes I wrote did).

>The above two problems are compounded by the apparent lack of any
>attempts to allocate, say, all allocation blocks on the same track
>so that scanning through the allocation list may require many seeks.

Actually, you want something like the 4BSD fast file system - trying
to put blocks at the correct interleave on the same track/cylinder.
Since the allocation blocks aren't used very often in normal use (see
'A Trace-Driven Analysis of the Unix 4.2 BSD File System' in the 10th
SOSP), doing that with allocation blocks isn't suited to a
multi-tasking system. The apparent caching of the allocation blocks
when they are read in is probably optimal.

>Now, even if the above statements are not entirely accurate

Well, it looks to me like the one about seeks is flat wrong. The
comment about doing globbing is true; I still can't figure out why
they used globbing so heavily in the WorkBench.

>the fact
>remains that the AmigaDos file structure is, for practical purposes,
>too inefficient.

I hadn't noticed. But I use deep, narrow trees, so directory searching
isn't murderous for me (the Browser has problems when I try to look at
other people's disks, with fat trees).

>It discourages seeks and discourages directory scans.
>Instead it gives some (doubtful) savings due to small directory sizes
>and simplicity of algorithms for seeks (read lazy programmers ?!).

Sorry, but the AmigaDOS file system is probably less space efficient
than Unix. You add one block for file header, then put the name and
link information in each block in the file. The savings in the
directory are negligible compared to this.

>I'm not writing this purely because I disagree with the file system
>structure in some abstract manner, but because for several applications,
>this file system is almost unusable (e.g. picking characters out of
>a 700k Kanji font file).

Give me a file system, I'll give you an application for which it's
almost unusable. Doing high-quality DBM kills most systems that grew
out of the Multics OS family.

>It should also be noted that a simple disk cache at the disk driver
>level is not necessarily going to improve matters. What is really needed
>is a way to instruct Dos to keep the allocation tables for a specific
>file in memory.

No, you need a GOOD disk caching system - comparable to 4BSD, and
better than the System V cache.
The way AmigaDOS handles (or mishandles) memory management makes this
nontrivial. I hope to take a crack at it sometime this year.

>Ultimately, I find that the current file system should be replaced
>with something more UN*X like (with name/inode pairs in directories,
>inodes, and links). I'm not saying this because I find the UN*X
>file system pleasing to the eye, but because it works, and because

No, NO, NO! Going to an ilist with name/inode pairs DOUBLES the number
of disk accesses to open a file by name! It also makes the file system
less robust - a hit on the ilist is fatal to Unix, but the AmigaDOS
file system can theoretically be recovered from any single-block hit.

The AmigaDOS file system is theoretically faster than the Unix file
system for most operations, the major exception being searching a
directory. It trades space and links for robustness. On a home machine,
this is a win - *IF* there's a tool for rebuilding broken disks. I
understand that MetaComCo has one; why don't we have it?

>So, one question, you have seen already: I need a workaround to access
>random parts of a large file quickly. I would greatly prefer not to split
>the file up... (and that wouldn't necessarily help anyhow).

I suggest you use the same solution that people who need fast disk
access on Unix use. Use the raw disk. There are examples of using the
trackdisk driver floating around on the net; using that makes doing
your seeks easy: divide the byte number you want by the number of bytes
per track, and go read the track it's on. There's no way to make it go
faster, except buying faster disks.

/* The two header names were lost from this copy of the article; the
 * ones below are an assumption based on what the code uses. */
#include <libraries/dos.h>
#include <stdio.h>

static long file ;      /* AmigaDOS file handle, 0 until opened */

void do_seek(int), do_read(int), done(int, char *) ;

void
main(argc, argv)
int argc;
char **argv;
{
    struct DateStamp How_Long, Time_Function() ;
    int whereto ;

    if (argc < 2) done(20, "usage: seektest ") ;
    if ((file = Open(argv[1], MODE_OLDFILE)) == 0)
        done(20, "can't open input file") ;

    for (;;) {
        printf("? ") ;
        /* Stop on end of input or bad input as well as on 0 */
        if (scanf("%d", &whereto) != 1 || whereto == 0)
            done(0, "exit") ;
        /* Rewind first, so every timed seek starts from byte 0 */
        Seek(file, 0, OFFSET_BEGINING) ;
        How_Long = Time_Function(do_seek, whereto) ;
        printf("Seek took %5ld ticks\n", How_Long . ds_Tick) ;
        How_Long = Time_Function(do_read, 0) ;
        printf("Read took %5ld ticks\n", How_Long . ds_Tick) ;
        }
}

/*
 * Time_Function - just return a DateStamp saying how long it took
 *      to call the given function with the given argument
 */
struct DateStamp
Time_Function(function, argument)
void (*function)(int);
int argument;
{
    struct DateStamp start, end, diff ;

    DateStamp(&start) ;
    (*function)(argument) ;
    DateStamp(&end) ;

    /* Subtract field by field, borrowing from the next larger unit */
    diff . ds_Tick = end . ds_Tick - start . ds_Tick ;
    if (diff . ds_Tick < 0) {
        diff . ds_Tick += (60 * TICKS_PER_SECOND) ;
        --(end . ds_Minute) ;
        }
    diff . ds_Minute = end . ds_Minute - start . ds_Minute ;
    if (diff . ds_Minute < 0) {
        diff . ds_Minute += (24 * 60) ;
        --(end . ds_Days) ;
        }
    diff . ds_Days = end . ds_Days - start . ds_Days ;
    return diff ;
}

/*
 * do_seek - seek to where in (global) file.
 */
void
do_seek(where)
int where;
{
    if (Seek(file, where, OFFSET_BEGINING) < 0)
        done(20, "Seek failed") ;
}

/*
 * do_read - get one byte from the (global) file, and throw it away.
 *      The argument is ignored; it is only there so that do_read
 *      matches what Time_Function expects.
 */
void
do_read(unused)
int unused;
{
    int junk ;

    if (Read(file, &junk, 1) < 0)
        done(20, "Read failed") ;
}

/*
 * done - say why we're leaving, close the file if it's open, and
 *      exit with the given return code.
 */
void
done(how, why)
int how;
char *why;
{
    fprintf(stderr, "seektest: %s\n", why) ;
    if (file != 0) Close(file) ;
    exit(how) ;
}