Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!rutgers!mit-eddie!bloom-beacon!mit-hermes!iuvax!pur-ee!uiucdcs!uiucdcsb!kenny
From: kenny@uiucdcsb.cs.uiuc.edu
Newsgroups: comp.unix.questions
Subject: Re: Any LEX gurus?
Message-ID: <166400005@uiucdcsb>
Date: Sat, 8-Aug-87 18:05:00 EDT
Article-I.D.: uiucdcsb.166400005
Posted: Sat Aug 8 18:05:00 1987
Date-Received: Sun, 9-Aug-87 13:38:14 EDT
References: <716@umnd-cs.D.UMN.EDU>
Lines: 52
Nf-ID: #R:umnd-cs.D.UMN.EDU:716:uiucdcsb:166400005:000:2113
Nf-From: uiucdcsb.cs.uiuc.edu!kenny Aug 8 17:05:00 1987

/* Written 4:09 pm Aug 7, 1987 by jwabik@cs.D.UMN.EDU in uiucdcsb:comp.unix.questions */
/* ---------- "Any LEX gurus?" ---------- */

[Story of the woes trying to code a LEX expression that will accept
newlines, wanting to match preprocessor directives that look like:

	?? --stuff-possibly-containing-newlines-- ??
]

The problem here is that (as you'll note) this expression cannot cope
with comments that contain newlines (which is fine, since neither can
the compiler 8^), and for the LIFE of me, I CANNOT write an expression
that WILL accept similar preprocessor directives that NEED to contain
newlines.
/* End of text from uiucdcsb:comp.unix.questions */

You COULD code a single regular expression to match all this stuff,
but it would be (1) really awkward, and (2) likely to overrun LEX's
token buffer trying to absorb your preprocessor directives.

A better plan is to use lex's syntax for left context sensitivity
(start conditions). Your file will begin with something like:

	%Start DIRECTIVE
	%%
	<INITIAL>"??"		{ BEGIN DIRECTIVE; }
	<DIRECTIVE>"??"		{ BEGIN INITIAL; }
	<DIRECTIVE>.|"\n"	;

which tells lex the following:

	There is a new scanner state called DIRECTIVE.

	When you see the string "??" in the initial state, go to
	DIRECTIVE state.

	If you see the string "??" in the DIRECTIVE state, go back to
	the initial state.

	Any other characters in the DIRECTIVE state, including
	newlines, are ignored.
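
If it helps to see the state switching spelled out, here is a rough
hand-written C sketch of the same two-state machine. It is not what lex
generates, only an illustration of the rules above. The function name
strip_directives and the enum are my own inventions, and I have assumed
that characters seen in the initial state are simply copied to the
output, standing in for whatever your other rules would do with them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scanner states, mirroring the INITIAL and DIRECTIVE start conditions. */
enum state { INITIAL, DIRECTIVE };

/*
 * strip_directives (hypothetical name, not part of lex): copy `in` to
 * `out`, dropping everything from a "??" through the next "??".
 * `out` must have room for strlen(in) + 1 bytes.
 */
static void strip_directives(const char *in, char *out)
{
    enum state st = INITIAL;
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);

    while (in[i] != '\0') {
        if (strncmp(in + i, "??", 2) == 0) {
            /* "??" flips the state in either direction; like lex,
             * the two-character match wins over the one-character one. */
            st = (st == INITIAL) ? DIRECTIVE : INITIAL;
            i += 2;
        } else {
            if (st == INITIAL)
                out[o++] = in[i];   /* assumed action: copy through */
            /* in DIRECTIVE state the character, newline or not, is dropped */
            i++;
        }
    }
    out[o] = '\0';
}
```

Calling strip_directives("int a;\n?? skip\nme ??\nint b;\n", buf) leaves
"int a;\n\nint b;\n" in buf: the directive and the newline inside it are
gone, and the newlines outside it survive.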

All the rest of your rules will have to have <INITIAL> placed in front
of them, to keep them from firing in DIRECTIVE state. For example, a
complete LEX program to strip preprocessor directives in your format
and place the remainder of the program on stdout would consist of the
above, followed by the single line

	<INITIAL>.|"\n"		ECHO;

which says, ``echo any single character, including a newline, seen in
the INITIAL state to stdout.''

Kevin Kenny			UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kenny
Department of Computer Science	ARPA: kenny@B.CS.UIUC.EDU (kenny@UIUC.ARPA)
University of Illinois		CSNET: kenny@UIUC.CSNET
1304 W. Springfield Ave.
Urbana, Illinois, 61801		Voice: (217) 333-8740