Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!ames!oliveb!sun!pepper!cmcmanis
From: cmcmanis%pepper@Sun.COM (Chuck McManis)
Newsgroups: comp.sys.amiga
Subject: Re: Big directories vs. many directories
Message-ID: <27583@sun.uucp>
Date: Wed, 9-Sep-87 19:59:20 EDT
Article-I.D.: sun.27583
Posted: Wed Sep 9 19:59:20 1987
Date-Received: Sat, 12-Sep-87 00:46:15 EDT
References: <5215@jhunix.UUCP>
Sender: news@sun.uucp
Reply-To: cmcmanis@sun.UUCP (Chuck McManis)
Organization: Sun Microsystems, Mountain View
Lines: 24

In article <5215@jhunix.UUCP> ins_adjb@jhunix.UUCP (Daniel Jay Barrett) writes:
> I have a question about searching your directory path. Is it more efficient
> to have many small directories in your path, or to have fewer, larger
> directories?

[Bear with me if this is a repeat; I'm a bit behind.]

Empirical evidence suggests that the two killers in directory performance
are disk fragmentation and large files. Both of these are tickled by the
fact that the current handler for disk files doesn't use the 'Size' field
in the disk header to determine the size of the file, and thus it seeks out
and reads every extension block to determine the file size. It can be argued
that this is 'good' because it doesn't put any limit on file size, but on
floppies it is a non-win since files can't span volumes anyway. To reduce
fragmentation, format a fresh disk and then use Copy All to copy all of the
files over to it. On your data disks this can improve performance by 10 - 50%.

Another minor problem crops up when the number of files in the directory
exceeds the optimal size given the hash table size of 76 entries. Lots of
files cause hash collisions, and thus extra lookups, as the handler moves
down the hash chain doing string compares.

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis   BIX: cmcmanis  ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that didn't you.
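
As a rough illustration of the hash-chain point made in the post, here is a minimal sketch of a fixed-size directory hash table with chaining. It counts the string compares needed to find each file as the directory grows. Everything here is an assumption made for illustration: the table size is simply the figure quoted in the post, the hash function is a stand-in and is not claimed to be the one the handler uses, and the file names and the Directory class are hypothetical.

```python
TABLE_SIZE = 76  # figure quoted in the post; taken as an assumption here


def name_hash(name, table_size=TABLE_SIZE):
    """Stand-in case-insensitive string hash (not the handler's actual one)."""
    h = len(name)
    for ch in name.upper():
        h = (h * 13 + ord(ch)) & 0x7FF
    return h % table_size


class Directory:
    """Hypothetical directory: one chain of names per hash slot."""

    def __init__(self, table_size=TABLE_SIZE):
        self.table_size = table_size
        self.chains = [[] for _ in range(table_size)]

    def add(self, name):
        self.chains[name_hash(name, self.table_size)].append(name)

    def lookup(self, name):
        """Walk one chain; return (found, number of string compares)."""
        compares = 0
        for entry in self.chains[name_hash(name, self.table_size)]:
            compares += 1
            if entry.upper() == name.upper():
                return True, compares
        return False, compares


def average_compares(n_files, table_size=TABLE_SIZE):
    """Average compares to find each of n_files hypothetical file names."""
    directory = Directory(table_size)
    names = ["file%04d" % i for i in range(n_files)]
    for name in names:
        directory.add(name)
    total = sum(directory.lookup(name)[1] for name in names)
    return total / n_files


if __name__ == "__main__":
    for n in (20, 76, 300, 760):
        print(n, "files:", round(average_compares(n), 2), "compares on average")
```

With far more files than slots, every chain has to hold several names whatever hash is used, so the average lookup cost grows roughly in proportion to files per slot. Splitting the same files across several directories keeps each chain short, at the price of one extra lookup per path component.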