Newsgroups: comp.lang.c Path: utzoo!henry From: henry@zoo.toronto.edu (Henry Spencer) Subject: Re: Comparing strings... (long but maybe worth it) Message-ID: <1990Oct14.095336.2819@zoo.toronto.edu> Organization: U of Toronto Zoology References: <2205.271700c2@cc.nu.oz.au> <11485@alice.att.com> Date: Sun, 14 Oct 90 09:53:36 GMT In article <11485@alice.att.com> ark@alice.att.com (Andrew Koenig) writes: > int compare (register char *p, register char *q) > { > while (*p) { > if (!*q || *p > *q) > return GREATER; > if (*p++ < *q++) > return LESS; > } > return *q? LESS: EQUAL; > } > >By this time, the program should be thoroughly incomprehensible... Some (for example, me) would say that this was aided by your reliance on implicit comparisons to zero, which hurt readability and do nothing for your stated goal of greater efficiency. Why did you bother doing that? >I imagine that further optimizations are possible... Definitely. For one thing, when coding seriously-tight versions of such things, we never make one comparison after another *inside* a loop to decide *how* to exit it; we use the shortest possible test to decide *whether* to exit it, so that the go-around-again path is as quick as possible. Examining the various tests in your loop together, the loop continuation condition is simply `*p == *q'. So we get (putting back the explicit comparison while I'm at it): while (*p != '\0' && *p == *q) p++, q++; However, we are still doing *p twice, which costs us on many machines unless our compiler is smart enough to do common-subexpression merging. We can do this explicitly, at the cost of declaring another (register!) variable: while ((c = *p) != '\0' && c == *q) p++, q++; The one obvious trick we haven't applied to this is merging the increment operators with the fetches, as in your earlier code. That actually can't be done in my first loop because of problems in the after-loop code (which I am omitting because it's 0530 and I want to wrap this up), but is viable in the second one: while ((c = *p++) != '\0' && c == *q++) continue; (the `continue' being present for readability, normally at no cost to efficiency). This is about as greased up as it's going to get without a radical change in approach. The after-loop code will need to check `c' to find out which condition terminated the loop, and will need to back up pointers and make the detailed comparisons to decide what value to return. (If one were tasteless enough to use a goto :-), the `c' re-check could be avoided, but since it's outside the loop the efficiency gain is slight.) Completing it is left as an exercise to the reader. A further optimization, details left as another optional exercise, is to define LESS, EQUAL, and GREATER in terms of sign rather than precise values, and use character subtraction to derive a suitably-signed value, avoiding a series of tests that slow most machines down somewhat and do bad things to pipelined machines, which really hate conditional branches. For extra credit, discuss how current implementations by both Berkeley and AT&T attempt this optimization and thereby introduce a bug (hint: consider the fact that char is signed on most machines). Project for the really determined reader: I said things weren't going to get much faster without a radical change in approach. Find one. (Yes, it exists.) Discuss tradeoffs and assumptions. -- "...the i860 is a wonderful source | Henry Spencer at U of Toronto Zoology of thesis topics." --Preston Briggs | henry@zoo.toronto.edu utzoo!henry