Path: utzoo!attcan!lsuc!eci386!clewis
From: clewis@eci386.uucp (Chris Lewis)
Newsgroups: comp.unix.questions
Subject: Re: lex/yacc questions from a novice...
Keywords: lex yacc
Message-ID: <1989Aug25.180538.10324@eci386.uucp>
Date: 25 Aug 89 18:05:38 GMT
References: <711@larry.sal.wisc.edu>
Reply-To: clewis@eci386.UUCP (Chris Lewis)
Organization: R. H. Lathwell Associates: Elegant Communications, Inc.
Lines: 34

In article <711@larry.sal.wisc.edu> jwp@larry.sal.wisc.edu (Jeffrey W Percival) writes:
>I am trying to use lex and yacc to help me read a dense, long,
>machine-produced listing of some crappy "special purpose" computer
>language.  I have a listing of the "rules" (grammar?) governing
>the format of each line in the listing.

>Along these lines, a problem I am having is getting the message "too
>many definitions" from lex, when all I have are a few keywords and
>ancillary definitions: (lex file included below for illustration).  Is

Generally speaking, identifying keywords directly in lex isn't worth the
bother.  Normally (when you're writing a compiler, for example) you have
lex recognize only generic token classes, and once you've made that
decision, the tokenizing rules are pretty easy:

[A-Za-z_][A-Za-z0-9_]*:	word
[0-9]+:	number
"+":	PLUS
"-":	MINUS
"+=":	PLUSEQ

In this case, you generally have lex match "words", and once you've
caught one, you do some sort of hashed lookup in a keyword table.  If
the word is a keyword, you return the yacc token define for it;
otherwise you return "IDENTIFIER".

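A minimal sketch of that lookup, with assumptions stated: the token
names and numbers are made up for illustration (a real build would take
them from y.tab.h), lookup_word() is a hypothetical helper, and a sorted
table searched with bsearch() stands in for the hash, which is plenty
for a small keyword set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; in a real build these come from y.tab.h. */
enum { IDENTIFIER = 258, IF, ELSE, WHILE, RETURN };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name (strcmp order) for bsearch() to work. */
static const struct keyword keywords[] = {
    { "else",   ELSE   },
    { "if",     IF     },
    { "return", RETURN },
    { "while",  WHILE  },
};

/* Compare the search key (a C string) against a table entry's name. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Return the keyword's token code, or IDENTIFIER if word is not a keyword. */
int lookup_word(const char *word)
{
    const struct keyword *k = bsearch(word, keywords,
                                      sizeof keywords / sizeof keywords[0],
                                      sizeof keywords[0], cmp_keyword);
    return k ? k->token : IDENTIFIER;
}
```

The action on the single "word" rule in the lex file then shrinks to
something like "return lookup_word(yytext);".
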
Actually, I usually skip lex altogether - once you've eliminated
explicit keyword recognition, it's usually simpler (and a hell of a lot
smaller and faster) to code the analyzer in C.  I.e., 100 lines will
often do a reasonable job of tokenizing C itself (excepting possibly
floating point stuff).
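
To give a rough idea of the hand-coded approach, here is a sketch that
covers just the five token classes above.  Assumptions: the token codes
are invented for illustration, next_token() is a hypothetical name, and
the caller's buffer holds at least 3 bytes.  It is nowhere near a full
C tokenizer (no comments, strings, or floating point).

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical token codes; with yacc these would come from y.tab.h. */
enum { T_EOF = 0, WORD = 258, NUMBER, PLUS, MINUS, PLUSEQ };

/*
 * Read the next token from fp and store its text, NUL-terminated and
 * truncated to fit, in buf (size must be at least 3).  Characters that
 * match no rule are returned as themselves, the usual yacc convention
 * for single-character tokens.
 */
int next_token(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    while ((c = getc(fp)) != EOF && isspace(c))
        ;                                   /* skip whitespace */
    if (c == EOF) {
        buf[0] = '\0';
        return T_EOF;
    }

    if (isalpha(c) || c == '_') {           /* word: [A-Za-z_][A-Za-z0-9_]* */
        do {
            if (n + 1 < size)
                buf[n++] = (char)c;
        } while ((c = getc(fp)) != EOF && (isalnum(c) || c == '_'));
        if (c != EOF)
            ungetc(c, fp);                  /* push back the lookahead */
        buf[n] = '\0';
        return WORD;
    }

    if (isdigit(c)) {                       /* number: [0-9]+ */
        do {
            if (n + 1 < size)
                buf[n++] = (char)c;
        } while ((c = getc(fp)) != EOF && isdigit(c));
        if (c != EOF)
            ungetc(c, fp);
        buf[n] = '\0';
        return NUMBER;
    }

    buf[0] = (char)c;
    buf[1] = '\0';
    if (c == '+') {                         /* "+" or "+=" */
        int d = getc(fp);
        if (d == '=') {
            buf[1] = '=';
            buf[2] = '\0';
            return PLUSEQ;
        }
        if (d != EOF)
            ungetc(d, fp);
        return PLUS;
    }
    if (c == '-')
        return MINUS;
    return c;                               /* anything else: itself */
}
```

A yylex() for yacc would wrap this, running each WORD through the
keyword table before returning it.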
-- 
Chris Lewis, R.H. Lathwell & Associates: Elegant Communications Inc.
UUCP: {uunet!mnetor, utcsri!utzoo}!lsuc!eci386!clewis
Phone: (416)-595-5425