Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!ucsd!sdd.hp.com!wuarchive!uunet!unhd!jwm775 From: jwm775@uunet!unhd (Jonathan W Miner) Newsgroups: comp.lang.c Subject: Re: hash function for text in C Message-ID: <1990Oct23.214712.14393@uunet!unhd> Date: 23 Oct 90 21:47:12 GMT References: <1990Oct16.184951.513@watdragon.waterloo.edu> Reply-To: jwm775@unhd.UUCP (Jonathan W Miner) Organization: Computing Information Services, University of New Hampshire Lines: 30 In article <1990Oct16.184951.513@watdragon.waterloo.edu> cpshelley@violet.waterloo.edu (cameron shelley) writes: > I have been (futily :<) experimenting with hash functions to >hash english words into a large array. No book I can find around >here gives the topic more than cursory discussion. Does anyone >out there know of a good, simple one? They must exist, and I'm >getting a little tired of mine... (er, function that is, I can't >afford another data structures book right now :). I am not an expert on hash functions, but can suggest a couple of methods: 1. Add the ascii values and 'mod' by the size of your array. 2. Or, a small modification: add only certain characters of each string. Say 1,3,5,7 and 11 and 'mod' by the size of the array. Both those methods, and many similar ones involve handling collisions in some manner. This can be messy (or so I'm told). To skip this problem, use a dynamic hashing function. This is refered to Fagin's Approach. It avoid's collisions by allocating space as needed and modifying the hash function. E-Mail if you want more info on this. ----------------------------------------------------------------- Jonathan Miner | I don't speak for UNH, and UNH does not jwm775@unhd.unh.edu | speak for me! (603)868-3416 | Rather be downhill skiing... or hacking... -- ----------------------------------------------------------------- Jonathan Miner | I don't speak for UNH, and UNH does not jwm775@unhd.unh.edu | speak for me! (603)868-3416 | Rather be downhill skiing... or hacking...