Newsgroups: comp.lang.c
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!think.com!snorkelwacker.mit.edu!bloom-picayune.mit.edu!news
From: scs@adam.mit.edu (Steve Summit)
Subject: Re: Efficient STRing CoMPares?
Message-ID: <1991Mar20.235613.20480@athena.mit.edu>
Summary: symbol table hashing
Sender: news@athena.mit.edu (News system)
Reply-To: scs@adam.mit.edu
Organization: Massachusetts Institute of Technology
References: <1991Mar20.020957.9180@athena.mit.edu> <1991Mar20.213615.11223@noose.ecn.purdue.edu>
Date: Wed, 20 Mar 91 23:56:13 GMT
Lines: 64

In article <1991Mar20.020957.9180@athena.mit.edu>, I wrote:
>You can also write great little symbol table implementations
>using hashing, and the hash values don't even have to take up
>extra space (they're implicit in the bucket or slot indices).

In article <1991Mar20.213615.11223@noose.ecn.purdue.edu> psmith@iies.ecn.purdue.edu (Paul F Smith) writes:
>...could you 
>please elaborate on how your symbol table works. I just implemented a
>hash table that keeps the hash value around (like your struct above), 
>which I then use in scanning the list at the given bucket to minimize
>strcmp()s.  But, I can't see how you can get rid of it and still reduce 
>the number of strcmp()s. Please describe, Thanks!

Well, I didn't say the number of strcmp's was reduced to 1.
Hashing has done its work (i.e. eliminated a lot of strcmp's and
associated searching) when it has computed the right bucket to
look in; there aren't supposed to be large numbers of collisions
to scan over.  I just call strcmp for each entry in the bucket.
The code is tiny, so I'll show it:

struct symtabent
	{
	char *st_name;
	struct type *st_type;
	somethingorother st_value;
	struct symtabent *st_next;
	};

struct symtab
	{
	int st_hashsize;
	struct symtabent *st_hashtab[1];	/* dynamically extended */
			/* ("unwarranted chumminess with the compiler") */
	};

stinsert(step, stp)
struct symtabent *step;
struct symtab *stp;
{
unsigned int h = hash(step->st_name) % stp->st_hashsize;
/* simpleminded; doesn't check for duplicates */
step->st_next = stp->st_hashtab[h];
stp->st_hashtab[h] = step;
}

struct symtabent *
stlookup(name, stp)
char *name;
struct symtab *stp;
{
unsigned int h = hash(name) % stp->st_hashsize;
register struct symtabent *step;

for(step = stp->st_hashtab[h]; step != NULL; step = step->st_next)
	{
	if(strcmp(step->st_name, name) == 0)
		return step;
	}

return NULL;
}

                                            Steve Summit
                                            scs@adam.mit.edu