Xref: utzoo comp.lang.c:29289 comp.unix.wizards:22259
Path: utzoo!attcan!uunet!aplcen!samsung!emory!emcard!gatech!galbp!samna!jeff
From: jeff@samna.UUCP (Jeff Barber)
Newsgroups: comp.lang.c,comp.unix.wizards
Subject: Re: Lex and initial start conditions
Message-ID: <254@samna.UUCP>
Date: 1 Jun 90 15:58:02 GMT
References: <6342@crabcake> <1990May30.174745.1161@csrd.uiuc.edu>
Reply-To: jeff@samna.UUCP (Jeff Barber)
Followup-To: comp.unix.wizards
Organization: Draughtsman's Contractors
Lines: 65

In article <1990May30.174745.1161@csrd.uiuc.edu> pommu@iis.ethz.ch (Claude Pommerell) writes:
>However, if you put such an insertion text after "%%" (in the rules
>section of your Lex source), it gets inserted at the start of the body
>of the function that performs the lexical analysis, so you can use it
>to specify an initial condition.

That's okay for this particular situation.  But it won't work if your
lex program is a lexical analyzer in a larger program.  Your placement
of the "BEGIN start-symbol;" after the first %% causes it to be included
at the beginning of the yylex() function.  This means that every time
you call the lexical analyzer for a new token, its state gets reset.
If your actions are designed to return a token to a parser (a yacc
program, for example), they'll contain statements like:

	return TOK_IDENTIFIER;

so yylex() is re-entered, and the BEGIN re-executed, once per token.

So, a better general-purpose solution is to define some function after
the *second* %% which contains the BEGIN statement and is called once to
initialize the analyzer.  In your case, we can just create a main()
function with the BEGIN in it (you've also got some unnecessary states
in here, so I've simplified a bit):

--------------------Cut Here----------------------------
%{
/* context in recursive C-like comments */
static int commentLevel = 0;
%}
/* Starting conditions to support recursive C-like comments */
%START Text InCCom
%%
\/\*		{ ++commentLevel; BEGIN InCCom; }
<InCCom>\*\/	{ if (--commentLevel == 0) BEGIN Text; }
<Text>\*\/	{ printf("Syntax error\n"); exit(1); }
<InCCom>.	|
<InCCom>\n	{ /* Ignore stuff inside of comments;
		     everything else is echoed by default. */ }
%%
main(ac, av)
char **av;
{
	/* Set the initial condition */
	BEGIN Text;
	return yylex();
}
--------------------Cut Here----------------------------

One last thing: it is possible to use the name of the initial state
("INITIAL") in a rule, so that if INITIAL were substituted for Text, no
state initialization would be necessary (nor would our main() function;
it would be supplied by the lex library [ cc ... -ll ]).  (BTW, does
anybody know whether this is portable?  I don't recall reading about
this INITIAL state in the documentation; I just noticed it in the
lex.yy.c output and discovered by experimentation that lex recognizes
it in a rule.)

I've directed followups out of comp.lang.c.

Jeff
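
A rough way to see the reset problem without running lex is to model it in plain C. The sketch below is not lex output and not the code from this thread: the scanner struct, the token names and the reset_each_call flag are all hypothetical, and comment nesting is left out to keep it short. It only assumes what the post describes, namely that code placed right after the first %% runs at the top of every yylex() call, while a BEGIN in an init function runs once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for lex start conditions and token codes. */
enum state { TEXT, IN_COMMENT };
enum token { TOK_EOF, TOK_OPEN, TOK_CLOSE, TOK_TEXT_CHAR, TOK_COMMENT_CHAR };

struct scanner {
    const char *p;          /* next unread input character */
    enum state state;       /* current start condition */
    int reset_each_call;    /* nonzero: model BEGIN placed after the first %% */
};

/* One-time initialization, like calling BEGIN Text from main(). */
void scanner_init(struct scanner *s, const char *input, int reset_each_call)
{
    assert(s != NULL && input != NULL);
    s->p = input;
    s->state = TEXT;
    s->reset_each_call = reset_each_call;
}

/* Return one token per call, the way a parser would drive yylex(). */
enum token next_token(struct scanner *s)
{
    if (s->reset_each_call)
        s->state = TEXT;    /* runs on every call, clobbering the saved state */

    if (s->p[0] == '\0')
        return TOK_EOF;
    if (s->p[0] == '/' && s->p[1] == '*') {
        s->p += 2;
        s->state = IN_COMMENT;
        return TOK_OPEN;
    }
    if (s->p[0] == '*' && s->p[1] == '/') {
        s->p += 2;
        s->state = TEXT;
        return TOK_CLOSE;
    }
    s->p++;
    return s->state == IN_COMMENT ? TOK_COMMENT_CHAR : TOK_TEXT_CHAR;
}

/* Count characters the scanner believes are outside any comment. */
int count_text_chars(const char *input, int reset_each_call)
{
    struct scanner s;
    enum token t;
    int n = 0;

    scanner_init(&s, input, reset_each_call);
    while ((t = next_token(&s)) != TOK_EOF)
        if (t == TOK_TEXT_CHAR)
            n++;
    return n;
}
```

With one-time initialization, count_text_chars("a/*bc*/d", 0) returns 2 (only "a" and "d" are outside the comment). With the per-call reset, count_text_chars("a/*bc*/d", 1) returns 4, because the state set by the "/*" token is forgotten before "b" and "c" are scanned.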