Path: utzoo!attcan!ncrcan!scocan!ron From: ron@scocan.sco.COM (Ron Irvine) Newsgroups: comp.lang.c Subject: Hash function in C Message-ID: <1990Nov13.175844.7731@sco.com> Date: 13 Nov 90 17:58:44 GMT Sender: news@sco.com (News administration) Organization: SCO Canada, Inc. (formerly HCR Corporation) Lines: 20 Originator: ron@capri > From: djones@megatest.UUCP (Dave Jones) > > I did a little bit of experimentation with the hash-functions which > have been posted here. Nothing scientific, mind you, but very interesting. > ..... > I think I can rationalize that. The CRC-16 function does indeed > spread the set of all strings uniformly over the interval 0 .. 2**16 - 1. If you AND off the upper bits for your hash table index for short strings you are tossing valuable information in the crc-16 upper bits. The key is to preserve bits. For a 256 entry hash table try: crc = (crc^(crc>>8)) & 0xff); Note: the EOR operator does a wonderful job in preserving bits. The classic problem is finding a function that is robust enough to handle machine generated labels like "L0001, L0002" and yet able to handle "normal" strings. Anyone out there have any other hash functions of interest?