Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!caen!uwm.edu!linac!att!ucbvax!dog.ee.lbl.gov!elf.ee.lbl.gov!torek From: torek@elf.ee.lbl.gov (Chris Torek) Newsgroups: comp.lang.c Subject: Re: Efficient STRing CoMPares? Message-ID: <11297@dog.ee.lbl.gov> Date: 22 Mar 91 00:45:12 GMT References: <1193@caslon.cs.arizona.edu> <15496@smoke.brl.mil> <1991Mar18.174207.7377@bingvaxu.cc.binghamton.edu> <15510@smoke.brl.mil> <1991Mar19.222410.1682@bingvaxu.cc.binghamton.edu> <1151@gistdev.gist.com> <11286@dog.ee.lbl.gov> Reply-To: torek@elf.ee.lbl.gov (Chris Torek) Organization: Lawrence Berkeley Laboratory, Berkeley Lines: 72 Supersedes: <11286@dog.ee.lbl.gov> X-Local-Date: Thu, 21 Mar 91 16:45:12 PST [This is a correction to a previous article, which should have been stamped out before you read it, because of the Supersedes header, but I suspect there are news sites out there that do not implement Supersedes. The correction is just an `='-for-`-' typo caused by runaway fingers. Thanks to David Barnett for catching it early.] In article <1151@gistdev.gist.com> flint@gistdev.gist.com (Flint Pellett) writes: >... I'm curious if anyone has a method that actually speeds up strcmp() >in the situtation where you have to get the lexicographic ordering >information back. For a particular application (my own personal version of Emacs, actually), I found that comparing the first two characters `inline' gave the best results. /* * Find an entry in a table. Create should be true iff we should create * a new entry if the name is not in the table. The names of any new * entries are saved with savestr(). */ struct tentry * FindEntry(t, name, create) struct table *t; register char *name; int create; { register struct tentry **ents, *te; register int i, h, m, l; if (name == NULL) return (NULL); if (t->t_sorted <= 0) SortEntries(t); ents = t->t_ents; /* binary search */ h = t->t_size - 1; l = 0; while (h >= l) { te = ents[m = (h + l) >> 1]; /* first two characters done inline for speed */ if ((i = te->te_name[0] - name[0]) == 0 && name[0] != 0) { if ((i = te->te_name[1] - name[1]) == 0) i = strcmp(te->te_name + 1, name + 1); } if (i > 0) h = m - 1; else if (i < 0) l = m + 1; else /* found */ return (te); } if (create) { if ((name = savestr(name)) != NULL && (te = insert(t, name, &ents[l])) != NULL) return (te); if (name != NULL) free(name); error("out of memory allocating entry in %s", t->t_name); } return (NULL); } SortEntries uses a clever trick: it does an insertion sort if the table is `almost' sorted, otherwise it does a quicksort. Much of this speed-hacking is due to *measured* results showing that my Emacs spent most of its `startup' time building tables and reading the current directory. -- In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427) Berkeley, CA Domain: torek@ee.lbl.gov