Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!sdd.hp.com!think.com!mintaka!spdcc!iecc!compilers-sender From: baxter@zola.ICS.UCI.EDU (Ira Baxter) Newsgroups: comp.compilers Subject: Re: Hash specifics Keywords: design Message-ID: <9012122107.aa06454@ICS.UCI.EDU> Date: 13 Dec 90 05:07:17 GMT References: <1990Dec12.221425.569@zoo.toronto.edu> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: Ira Baxter Organization: Compilers Central Lines: 24 Approved: compilers@iecc.cambridge.ma.us In <1990Dec12.221425.569@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: >Actually, I have never figured out why people seem so obsessed with perfect >hashing, since the hash functions typically seem noticeably expensive, and >it has always seemed to me that a fast hashing function plus fast resolution >of occasional collisions would be easier and just as good. As far as I can tell, perfect hashing costs extra. If you have a lexer independent of the parsing mechanism, so that it hasn't got any idea whether the next string is a keyword or not (assemblers scanning the opcode field violate this requirement), then you'll have to do a symbol/string table probe anyway, which can't be done using a perfect hash. So why not put the keywords into the string table with a tag bit noting they are keywords? The lexer now collects a word-atom, and looks it up; if tagged, it reports the corresponding keyword, if not, it reports a user symbol. This scheme saves the *additional* cost of doing a perfect hash. -- Ira Baxter -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.