Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!sdd.hp.com!think.com!mintaka!spdcc!iecc!compilers-sender
From: baxter@zola.ICS.UCI.EDU (Ira Baxter)
Newsgroups: comp.compilers
Subject: Re: Hash specifics
Keywords: design
Message-ID: <9012122107.aa06454@ICS.UCI.EDU>
Date: 13 Dec 90 05:07:17 GMT
References: <1990Dec12.221425.569@zoo.toronto.edu>
Sender: compilers-sender@iecc.cambridge.ma.us
Reply-To: Ira Baxter <baxter@zola.ICS.UCI.EDU>
Organization: Compilers Central
Lines: 24
Approved: compilers@iecc.cambridge.ma.us

In <1990Dec12.221425.569@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>Actually, I have never figured out why people seem so obsessed with perfect
>hashing, since the hash functions typically seem noticeably expensive, and
>it has always seemed to me that a fast hashing function plus fast resolution
>of occasional collisions would be easier and just as good.

As far as I can tell, perfect hashing costs extra.

If you have a lexer independent of the parsing mechanism, so that it
hasn't got any idea whether the next string is a keyword or not
(assemblers scanning the opcode field violate this requirement), then
you'll have to do a symbol/string table probe anyway, which can't be done
using a perfect hash.  So why not put the keywords into the string table
with a tag bit noting they are keywords?  The lexer now collects a
word-atom, and looks it up; if tagged, it reports the corresponding
keyword, if not, it reports a user symbol.  This scheme saves the
*additional* cost of doing a perfect hash.

--
Ira Baxter

-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.