Newsgroups: comp.lang.c
Path: utzoo!utgpu!jarvis.csri.toronto.edu!csri.toronto.edu!norvell
From: norvell@csri.toronto.edu (Theodore Stevens Norvell)
Subject: Re: regex for C comments
Message-ID: <1989Jul11.232547.13488@jarvis.csri.toronto.edu>
Organization: University of Toronto, CSRI
References: <19365@paris.ics.uci.edu> <502@chem.ucsd.EDU>
Distribution: na

In article <502@chem.ucsd.EDU> tps@chem.ucsd.edu (Tom Stockfisch) writes:
>In article <19365@paris.ics.uci.edu> schmidt@zola.ics.uci.edu (Doug Schmidt) writes:
>>In their book ``Introduction to Compiler Construction with UNIX,''
>>Schreiner and Friedman provide the following LEX regular expression
>>for recognizing C comments:
>>"/*""/"*([^*/]|[^*]"/"|"*"[^/])*"*"*"*/"
>
>This expression fails on each of the following:
>
>    /*****//hello world */

Is that really a C comment?  I think only the first 7 characters are.

>
>
>So, who has the shortest single LEX expression that correctly
>matches C comments --
>ignoring string and character constants,
>and disallowing start conditions?
>
>Mine is
>
>    "/*"\/*([^/]|{[^*/]\/+})*"*/"

    \/\*([^*]*\*+[^*/])*[^*]*\*+\/

or more legibly

    "/*" ( [^*]* "*"+ [^*/] )* [^*]* "*"+ "/"

Though I haven't proved it.

Theo Norvell
U of Toronto
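
Short of a proof, one way to sanity-check an expression of this kind is to transliterate it to Python's re syntax and try it on a few inputs. The sketch below is only an illustration and not part of the thread: the helper name leading_comment and the sample strings are made up, and the pattern used is a variant with a [^*]* before the closing run of stars, so that ordinary text directly before the final "*/" is consumed. Python's re module backtracks rather than taking the longest match as LEX does, but for this pattern the two should agree on these inputs.

```python
import re

# Python transliteration of a LEX-style C comment pattern.
# Assumption: this variant has [^*]* before the closing run of stars;
# without it, a comment such as "/* a */" would not match.
C_COMMENT = re.compile(r"/\*([^*]*\*+[^*/])*[^*]*\*+/")


def leading_comment(text):
    """Return the comment matched at the start of text, or None.

    Hypothetical helper for illustration only.
    """
    m = C_COMMENT.match(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    samples = [
        "/*****//hello world */",  # the disputed example from the thread
        "/* a */",
        "/**/",
        "/* a * b **/ x",
        "/* unterminated",
    ]
    for s in samples:
        print(repr(s), "->", repr(leading_comment(s)))
```

On the disputed input, this reports only the first 7 characters, "/*****/", as the comment.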