Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!eecae!netnews.upenn.edu!rutgers!att!ulysses!mhuxo!mhuxu!m10ux!mnc
From: mnc@m10ux.UUCP (Michael Condict)
Newsgroups: comp.lang.c
Subject: Re: Want a way to strip comments from a C file
Message-ID: <891@m10ux.UUCP>
Date: 23 Mar 89 17:17:00 GMT
References: <7150@siemens.UUCP> <880@m10ux.UUCP> <4060@ttidca.TTI.COM>
Distribution: usa
Organization: AT&T Bell Labs, Murray Hill
Lines: 34

In article <4060@ttidca.TTI.COM>, hollombe@ttidca.TTI.COM (The Polymath) writes:
> In article <880@m10ux.UUCP> mnc@m10ux.UUCP (Michael Condict) writes:
> }I recently posted to this group a shell script that
> } [ deletes comments from C source, among other things ]
> . . .
> If I understood the original posting correctly, it will also fail if it
> encounters a /* or */ within a quoted string constant.  E.g.:
> . . .

Oops, you are absolutely right.  After some analysis of this limitation in
my sed script, it is obvious that the regular expressions of sed (or awk or
vi/ex/ed) are too limited to handle the job in any reasonable fashion.
Besides, the lex script that does the job is trivial.  Someone pointed out
that they were posting a six-line lex script to comp.sources.unix.  This
doesn't seem like the best way to display the solution, since the article
announcing the posting was itself longer than six lines.

I'll throw out the following 3-line lex script, which has been tested on all
the devious ways of forming comments and quotes that I can think of.  In
particular, it handles comment delimiters within quotes and quotes within
comment delimiters:

----------- Lex script to delete comments from C source code ----------------
%%
\"([^\\"]*\\(.|\n))*[^\\"]*\"		ECHO;
"/*"([^*]*"*"[^/])*[^*]*"*/"		;
.					ECHO;
-----------------------------------------------------------------------------

Can anyone find anything wrong with this one (he asks stupidly)?  Can anyone
find a shorter solution?
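For comparison, the same quote-versus-comment bookkeeping can be sketched as a small hand-written state machine in C.  This is only an illustration of the technique, not the lex script above and not anything from the earlier postings: the function name strip_comments, the state names, and the caller-supplied output buffer are all assumptions made for the example.  The sketch also tracks character constants such as '"', which is an addition beyond what the three lex rules do.

```c
#include <stddef.h>

/* Scanner states; the names are illustrative, not taken from the post. */
enum state { CODE, SLASH, COMMENT, STAR, STRING, STR_ESC, CHARLIT, CHR_ESC };

/*
 * Copy src to dst with C comments removed.  dst must have room for
 * strlen(src) + 1 bytes.  Returns the number of bytes written, not
 * counting the terminating NUL.  (Hypothetical helper for this sketch.)
 */
size_t strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    size_t n = 0;

    for (; *src != '\0'; src++) {
        char c = *src;

        switch (st) {
        case CODE:
            if (c == '/') { st = SLASH; break; }  /* hold it: may open a comment */
            if (c == '"') st = STRING;
            else if (c == '\'') st = CHARLIT;
            dst[n++] = c;
            break;
        case SLASH:
            if (c == '*') { st = COMMENT; break; } /* drop the held slash too */
            dst[n++] = '/';                        /* held slash was ordinary code */
            if (c == '/') break;                   /* hold this new slash instead */
            if (c == '"') st = STRING;
            else if (c == '\'') st = CHARLIT;
            else st = CODE;
            dst[n++] = c;
            break;
        case COMMENT:                              /* discard until a '*' shows up */
            if (c == '*') st = STAR;
            break;
        case STAR:                                 /* saw '*' inside a comment */
            if (c == '/') st = CODE;
            else if (c != '*') st = COMMENT;
            break;
        case STRING:                               /* copy string text verbatim */
            dst[n++] = c;
            if (c == '\\') st = STR_ESC;
            else if (c == '"') st = CODE;
            break;
        case STR_ESC:                              /* char after a backslash */
            dst[n++] = c;
            st = STRING;
            break;
        case CHARLIT:                              /* copy character constant */
            dst[n++] = c;
            if (c == '\\') st = CHR_ESC;
            else if (c == '\'') st = CODE;
            break;
        case CHR_ESC:
            dst[n++] = c;
            st = CHARLIT;
            break;
        }
    }
    if (st == SLASH)
        dst[n++] = '/';        /* input ended right after a lone slash */
    dst[n] = '\0';
    return n;
}
```

Called as strip_comments(text, buf) with buf at least strlen(text) + 1 bytes long, this leaves a /* or */ inside a string constant alone and ignores quotes inside a comment.  Like the lex script, it deletes each comment outright rather than replacing it with a space; an unterminated comment simply swallows the rest of the input.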
Boy this is almost as much fun as computing factorial in the minimum-sized
C program.
-- 
Michael Condict		{att|allegra}!m10ux!mnc
AT&T Bell Labs		(201)582-5911    MH 3B-416
Murray Hill, NJ