Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!csd4.csd.uwm.edu!bionet!agate!usenet
From: hughes@math.berkeley.edu (Eric Hughes)
Newsgroups: comp.software-eng
Subject: Re: C source lines in file
Summary: Using flex to recognize comments
Message-ID: <1989Aug21.171017.27042@agate.berkeley.edu>
Date: 21 Aug 89 17:10:17 GMT
References: <6500@pdn.paradyne.com> <1658@naucse.UUCP>
Sender: usenet@agate.berkeley.edu (USENET Administrator;;;;ZU44)
Reply-To: hughes@math.berkeley.edu (Eric Hughes)
Organization: UC Berkeley Math Dept
Lines: 35
In-reply-to: jdc@naucse.UUCP (John Campbell)

In article <1658@naucse.UUCP>, jdc@naucse (John Campbell) writes:
>Anyway, here's a lex goodie I use to count comments, *exactly* what he
>wanted, right? Note that the output is in lines of 'C' code, so you could
>look very productive if you counted those lines of code instead!
>
>OBTW, this comment recognizer works well enough for my style of commenting.
>It does not solve the general problem of recognizing ANSI 'C' comments with a
>regular expression. A solution to that problem was posted a while back, but
>it's pretty ugly...

Flex, the lex replacement by Vern Paxson, has a wonderful capability
for recognizing comments that does not require a large, ugly regexp and
will not overflow the input buffer. One declares an exclusive start
condition (%x) that represents the predicate "the input pointer is
inside a comment." The start-of-comment and end-of-comment markers can
then be recognized by separate rules.

This technique can also be used to recognize string and character
constants, and in a general-purpose program it should be, so that a
comment start marker appearing inside a string is not mistaken for the
start of a comment.

Eric Hughes
hughes@math.berkeley.edu ucbvax!math!hughes

------------cut here-------------
/* Small flex program to recognize C-style comments in text. */

%x COMMENT

%%

"/*"			BEGIN( COMMENT ) ;
.			ECHO ;

<COMMENT>"*/"		BEGIN( 0 ) ;
<COMMENT>"*"		|
<COMMENT>[^*\n]+	|
<COMMENT>\n		;

%%