Newsgroups: comp.lang.c Path: utzoo!henry From: henry@zoo.toronto.edu (Henry Spencer) Subject: Re: hash function for mac needed Message-ID: <1991Jun25.052133.3220@zoo.toronto.edu> Date: Tue, 25 Jun 1991 05:21:33 GMT References: <14207.285B768A@stjhmc.fidonet.org> <14396@dog.ee.lbl.gov> <567@smds.UUCP> <1991Jun19.161959.13677@zoo.toronto.edu> <3634@litchi.bbn.com> Organization: U of Toronto Zoology In article <3634@litchi.bbn.com> rsalz@bbn.com (Rich Salz) writes: >|In fact, when I originally did the hashed newsgroup-lookup code in C News >|expire, a quick investigation did not reveal any hashing function which >|had overall performance much better than simply using the length of the >|string! ... > >I think you'll want to re-check. The distributions are real uneven -- there >are more than 40 newsgroups of length 10, for example, so you have to resolve >lots of conflicts, and if you don't put a secondary hash, that means lots of >strcmp's... Strcmps cost much less if you prefix them with an inline check of the first character. This strategy isn't as effective with newsgroups (which tend to have common prefixes) as with random strings, but it still helps quite a bit. And *as I mentioned*, the above-mentioned results are out of date; the hash chains were typically 2-3 long back then. -- "We're thinking about upgrading from | Henry Spencer @ U of Toronto Zoology SunOS 4.1.1 to SunOS 3.5." | henry@zoo.toronto.edu utzoo!henry