Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!mouse From: mouse@thunder.mcrcim.mcgill.edu (der Mouse) Newsgroups: comp.lang.c Subject: Re: What Data Structure? Keywords: Hash Table Trie Dictionary Message-ID: <1991Jun28.134149.29412@thunder.mcrcim.mcgill.edu> Date: 28 Jun 91 13:41:49 GMT References: <1991Jun24.110300.3740@ecsvax.uncecs.edu> Organization: McGill Research Centre for Intelligent Machines Lines: 41 In article <1991Jun24.110300.3740@ecsvax.uncecs.edu>, burgin@ecsvax.uncecs.edu (Robert Burgin) writes: > I am writing a program that includes a dictionary look-up of 24,500 > English words. Each entry in the dictionary will contain the word > itself and a string representing valid syntactic tags for that word > (noun, verb, etc.). > Would a hash table be best for the implementation? Or a trie? Why? Yes. :-) Seriously though, there is no single data structure that can definitely be said to be best, given no more information than you have given. For example, how dear is memory? If you can afford to dedicate several megabytes to the dictionary data structures, want a reasonably fast lookup, no inserts or deletes at run-time, and can afford to spend some CPU time preprocessing the dictionary, there's one reasonably simple answer. If you're tight on memory, need to insert and delete, can afford moderately slow lookups, and can't afford to do much preprocessing, something else would be better. > I worry about collisions in the hash table and wonder what the ideal > number of buckets would be for this case. That depends on the hash function. For example, if you can afford the preprocessing time to compute a perfect hash function, more than 24500 buckets would be a waste, and collisions can be ignored. > Not to mention having to scare up a proper hash function for strings > of variable length. Experiment with various functions, like the (h*33)+c function posted here recently, or a simple sum-of-characters, or a CRC...find out what works well for your case. der Mouse old: mcgill-vision!mouse new: mouse@larry.mcrcim.mcgill.edu