Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!zaphod.mps.ohio-state.edu!sdd.hp.com!elroy.jpl.nasa.gov!ames!haven!decuac!e2big.mko.dec.com!bacchus.pa.dec.com!decwrl!uunet!ontek!mikey From: mikey@ontek.com (krill are yummy and krunchy) Newsgroups: comp.lang.c Subject: Re: Wanted -- ambiguous lookup routine. Message-ID: <1260@ontek.com> Date: 10 Aug 90 18:34:13 GMT References: <1990Aug9.082839.3663@ifi.uio.no> Organization: Ontek Corporation -- Laguna Hills, California Lines: 40 In comp.lang.c and comp.sources.wanted, gisle@ifi.uio.no (Gisle Hannemyr) writes: | | I want to look up a string using an unambiguous abbreviation of the words | in the string. | | Does anyone have a fast C routine that do the above, or know a | clever algorithm to accomplish it. My main requirement is speed. The algorithm for determining whether the strings match is trivially simple: loop over calls to strchr(), where the first parameter steps through the "long" string and the second steps through the "short" string. If the short string is consumed first, there is a match; if the long string is consumed first, there is no match. Naturally, if the "long" string has fewer characters than the "short" string, there can be no match. The issue of ambiguity is best solved by maintaining a list of all the matches and then presenting that list to the user and letting him or her choose. A simpler and less user-friendly solution would be to simply count the matches as you go and abort when the count reaches two. As for speed of execution, the above algorithm should work suitably fast if the list is really in RAM. If the list is so big that most of it is sitting on a swap device somewhere, the disk access time may overshadow the time spent checking the strings themselves. A solution I've often adopted is to add one more constraint to the search: the first character of the two strings must match exactly. Then sort the list and do a binary search on the first char and scan upwards for matches giving up when you get to the next letter of the ascii alphabet. | Please mail your replies if possible. I do not read these groups | regularly. I feel like posting it anyway. Phblt. (pbuh) love, mikey