Path: utzoo!utgpu!watmath!uunet!mcsun!sunic!tut!tukki!tarvaine
From: tarvaine@tukki.jyu.fi (Tapani Tarvainen)
Newsgroups: gnu.utils.bug
Subject: Re: Grep bug?
Summary: Bug! Patch included.
Message-ID: <1309@tukki.jyu.fi>
Date: 16 Sep 89 09:10:41 GMT
References: <8909121817.AA11159@gem>
Reply-To: tarvaine@tukki.jyu.fi (Tapani Tarvainen)
Distribution: gnu
Organization: University of Jyvaskyla, Finland
Lines: 75

In article <8909121817.AA11159@gem> lang@PRC.UNISYS.COM writes:
...
>% grep -i '\Wx\W' foo
>
>But the effect here is that the -i flag makes the '\W'
>meta-character non-case-sensitive as well

I would call this a bug (and patched it).
I traced the problem to the following piece of code in dfa.c:

/* Parse and analyze a single string of the given length. */
void
regcompile(s, len, r, searchflag)
     const char *s;
     size_t len;
     struct regexp *r;
     int searchflag;
{
  if (case_fold)        /* dummy folding in service of regmust() */
    {
      static char *p;

      case_fold = 0;
      for (p = (char *)s; *p != 0; p++)
        if (isupper((int)*p))
          *p = tolower((int) *p);
...

I.e., when the -i flag is given, it folds the entire regexp to lower
case before doing anything else with it, so \W becomes \w and \B
becomes \b.  I failed to find any reason for this: the search routines
handle case folding on their own anyway, and removing the above loop
didn't seem to have any effect other than removing the undesired
effect of the -i flag on \W (and \B).

Does somebody know if the folding loop is necessary in some other
program using dfa.c (or do they maybe have the same bug)?
If so, the following should work (it avoids changing the letter
after a \):

      for (p = (char *)s; *p != 0; p++)
        if (isupper((int)*p))
          *p = tolower((int) *p);
        else if (*p == '\\' && *(p+1))
          p++;

As far as e?grep is concerned, however, just removing the loop
seems to work just fine (the declaration of p can be removed as well).
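For anyone who wants to try the backslash-skipping variant outside
dfa.c, here is a minimal standalone sketch of it.  The function name
fold_pattern is made up for illustration and does not exist in dfa.c,
and the casts to unsigned char are an addition, an assumption on my
part that they are wanted so that isupper() and tolower() stay well
defined for characters with the high bit set.

```c
#include <ctype.h>

/* Hypothetical helper, not part of dfa.c: fold a regexp to lower case
   in place, but leave the character after each backslash untouched so
   that escapes such as \W and \B keep their meaning.  */
void
fold_pattern(char *s)
{
  char *p;

  for (p = s; *p != '\0'; p++)
    {
      if (*p == '\\' && p[1] != '\0')
        p++;            /* step onto the escaped char; the loop's p++ moves past it */
      else if (isupper((unsigned char) *p))
        *p = (char) tolower((unsigned char) *p);
    }
}
```

With this, a pattern like \Wx\W comes through unchanged, while the
unescaped letters in something like Foo\Bar are folded and the B after
the backslash is kept.  Note that it still folds letters inside
bracket expressions such as [A-Z], so it only addresses the backslash
escapes.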

Here's a context diff that just removes the loop (actually it #if's
the loop and the declaration out rather than deleting them, and adds
a comment):

*** dfa.old     Sat Sep 16 12:04:32 1989
--- dfa.c       Sat Sep 16 12:04:34 1989
***************
*** 1668,1679 ****
--- 1668,1685 ----
  {
    if (case_fold)      /* dummy folding in service of regmust() */
      {
+ /* the following two #if 0's added by Tapani Tarvainen 16 Sep 89 */
+ /* to prevent -i flag from affecting \W and \B in e?grep */
+ #if 0
        static char *p;
+ #endif
  
        case_fold = 0;
+ #if 0
        for (p = (char *)s; *p != 0; p++)
          if (isupper((int)*p))
            *p = tolower((int) *p);
+ #endif
        reginit(r);
        r->mustn = 0;
        r->must[0] = '\0';
-- 
Tapani Tarvainen (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)