Path: utzoo!attcan!uunet!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!decwrl!deccrl!news.crl.dec.com!decvax.dec.com!ima!iecc!compilers-sender
From: eifrig@server.cs.jhu.edu (Jonathan Eifrig)
Newsgroups: comp.compilers
Subject: Lex and Start Conditions
Summary: How can one specify a start state?
Keywords: lex, question
Message-ID:
Date: 24 Jan 91 03:59:26 GMT
Sender: compilers-sender@iecc.cambridge.ma.us
Reply-To: eifrig@server.cs.jhu.edu (Jonathan Eifrig)
Organization: Compilers Central
Lines: 53
Approved: compilers@iecc.cambridge.ma.us

	Here's a Lex usage question concerning start conditions that I
encountered: I want to have my Lex-generated scanner strip out comments.
Conceptually, this is easy to do with start conditions.  For example:

	%Start NORM COM
	%%
	<NORM>{natnum}+		{return(ID);}
	<NORM>"/*"		{BEGIN COM;}
	<COM>"*/"		{BEGIN NORM;}
	<COM>[^]		{ /* Do nothing */}

	The idea is to have a simple little rule that eats a single
character and discards it while searching for the close-comment symbol,
thereby stripping the comment out.  So far, so good.

	This works great, except that the automaton has to be started up
in the correct (meta) state (in this case, NORM).  What is the best way to
do this?  I've found two ways of doing this, neither of which is very
pretty.

	Option 1: Kick the automaton into the NORM state manually before
parsing.  This basically involves having a main() like:

	main() {
		...
		BEGIN NORM;
		yyparse();
	}

Unfortunately, this requires importing the BEGIN macro into the main
program file, which is sort of unappealing.

	Option 2: Use the undocumented INITIAL start condition.  Substitute
INITIAL for NORM above.  Unfortunately, INITIAL isn't a "real" start
condition, so we can't just BEGIN INITIAL, but have to use BEGIN 0, which
is very cheesy.  In addition, I have no idea how portable it is.  Using
"undocumented features" seems like a bad idea to me.

	Does anyone have any other suggestions?

Jack Eifrig
eifrig@cs.jhu.edu

[You could use exclusive states in flex. -John]
--
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.
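
The moderator's suggestion of exclusive states might look roughly like the sketch below. It assumes flex rather than AT&T lex. The definition of natnum, the token value for ID, and the small main() driver are placeholders invented here so the fragment is complete; they are not taken from the original scanner. With an exclusive state (%x), unprefixed rules are active only in the initial state, so no NORM state has to be entered by hand before scanning starts.

```lex
%{
/* Sketch only: ID and natnum are assumed stand-ins, not the poster's
   actual definitions. */
#include <stdio.h>
#define ID 257
%}
%option noyywrap
%x COM
natnum  [0-9]
%%
{natnum}+       { return ID;        /* active only outside comments */ }
"/*"            { BEGIN(COM);       /* enter the exclusive comment state */ }
<COM>"*/"       { BEGIN(INITIAL);   /* back to the default state */ }
<COM>(.|\n)     { /* discard one character of comment text */ }
(.|\n)          { /* this sketch ignores all other input */ }
%%
int main(void)
{
    /* Hypothetical driver: print each number found outside comments. */
    while (yylex() != 0)
        printf("ID: %s\n", yytext);
    return 0;
}
```

In flex, INITIAL is a documented name for start condition 0, so BEGIN(INITIAL) avoids the BEGIN 0 spelling. Whether a given non-flex lex accepts %x or INITIAL is a separate portability question that this sketch does not settle.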