Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!samsung!zaphod.mps.ohio-state.edu!sunybcs!sybil.cs.Buffalo.EDU!dill From: dill@sybil.cs.Buffalo.EDU (Peter Dill) Newsgroups: comp.sys.amiga Subject: Re: Another DOS for AMIGA? Message-ID: <18201@eerie.acsu.Buffalo.EDU> Date: 23 Feb 90 02:13:54 GMT References: <1059@trlluna.trl.oz> <350@xyzzy.UUCP> Sender: nobody@acsu.Buffalo.EDU Organization: SUNY @Buffalo/Comp Sci and Discount Muffler Lines: 66 In article <350@xyzzy.UUCP> poirier@dg-rtp.dg.com ( Poirier local) writes: > >They may be different, but not necessarily "completely" different. >For some hash functions and collision-handling methods, things can get >bad. I can't speak to AmigaDOS' hash algorithm, but there are many simple >hash functions where names differing by just the final letter give >hash values differing by just the difference of those letters. This can >generate a clump of consecutive entries, which can make hash collisions >more severe. Particularly so if the collision algorithm is "step ahead >to the next unused entry". Again, I don't know if Amy's hashing does this, >just speaking in general. Here is the hash function from FF20: main( argc, argv ) int argc; char **argv; { if( argc != 2 ) { printf( "Usage: %s \n", argv[0] ); exit( 1 ); } printf( "hash is %ld\n", (hash( argv[1] ) % 72) + 6 ); } hash( s ) unsigned char *s; { int i; int res; unsigned char *sp; unsigned c; res = strlen( s ); for( i = 1, sp = s; *sp; i++, sp++ ) { c = *sp; if( c >= 'a' && c <= 'z' ) { c = c - 'a' + 'A'; } res = ((res * 13 + c ) & 0x7ff); } return( res ); } It looks to me like keys that differ in just in the final character could generate hash values that differ only by the difference of the last value. Perhapse (res +c)*13 would be a idea to prevent this but I don't think it matters in the case of Amiga Dos anyway my quick look at the Dos manual made it look like hash values are chained together. So contrary to the assert- ion of the orginal poster similar file names (differing in the last character ) would make collisions *less* likely. So "dataX" and "dataY" are very likely to hash to different values unless X and Y are the same letter in different cases. Back to the the original post- was he *sure* he was using FFS on the floppy? BTW, no I don't know what 'i' is doing in that loop either :-). Maybe sp had an index at one time. Peter Dill dill@cs.buffalo.edu "I've got more data on wrestling than Voyager has on Neptune" -RRP