Newsgroups: comp.lang.c Path: utzoo!sq!lee From: lee@sq.sq.com (Liam R. E. Quin) Subject: Re: compare strings, strcmp Message-ID: <1989Nov21.015141.18792@sq.sq.com> Keywords: STREQ, qsort Reply-To: lee@sq.com (Liam R. E. Quin) Organization: Unixsys (UK) Ltd References: <309@charyb.COM> <1989Nov17.155143.2758@aqdata.uucp> Date: Tue, 21 Nov 89 01:51:41 GMT Some general thoughts on STREQ, sorting, arrays and cranberry sauce... Doug Gwyn mentions >>> #define StrEq(a,b) (*(a) == *(b) && strcmp(a, b) == 0) /*UNSAFE*/ and comments it as UNSAFE... I first saw this as a Henry Spencer-ism, and hence generally write #define STREQ(henry, utzoo) (*(henry) == *(utzoo) && !strcmp(henry, utzoo)) (this has the same side-effect problems, but is a little more fun). For the original question, about sorting a large "in-core" array, the 4.3 BSD qsort routine is publicly available by uucp and ftp, and was on the Tahoe release. I mention this because it has much better behaviour in some cases, tending not to recurse as deeply as the older ones. strcmp() is usually quite fast, but if you are calling it a lot of times and this is a problem, at the loss of some generality and flexibility I suggest you take the BSD qsort() (or write your own) and WIRE IN the compare function to be the STREQ macro. Otherwise, STREQ does not help at all, as you have to give qsort(3) a function, not a macro. You might also want to think about using a different sort algorithm altogether. One that sorts one part of the large array and then goes on to look at another may take more swaps/compares, but will exhibit greater locality of reference -- in other words, will cause fewer page faults! This might be *much* more significant! Of course, if the strings themselves came from malloc individually, you can't predict where they are (in general) and this won't work, but I seem to remember seeing that you had an array char WhateverItWasCalled[1000][200]; implying a thousand strings each of exactly 200 characters. If the strings are in practice generally shorter, you may get a big win by calling malloc for each of them, possibly generating them in sorted order (for example using a tree or linked list). That way, you don't use up so much storage, and things will probably work out faster in the end. 1,000 calls to malloc doesn't seem to take very long -- especially if you have the newer System V malloc(3X). Hope this helps. Lee (I lied about the cranberry sauce) -- Liam R. Quin, Unixsys (UK) Ltd [note: not an employee of "sq" - a visitor!] lee@sq.com (Whilst visiting Canada from England) lee@anduk.co.uk (Upon my return to England at Christmas)