Path: utzoo!attcan!lsuc!eci386!clewis
From: clewis@eci386.uucp (Chris Lewis)
Newsgroups: comp.unix.questions
Subject: Re: lex/yacc questions from a novice...
Keywords: lex yacc
Message-ID: <1989Aug25.180538.10324@eci386.uucp>
Date: 25 Aug 89 18:05:38 GMT
References: <711@larry.sal.wisc.edu>
Reply-To: clewis@eci386.UUCP (Chris Lewis)
Organization: R. H. Lathwell Associates: Elegant Communications, Inc.
Lines: 34

In article <711@larry.sal.wisc.edu> jwp@larry.sal.wisc.edu (Jeffrey W Percival) writes:
>I am trying to use lex and yacc to help me read a dense, long,
>machine-produced listing of some crappy "special purpose" computer
>language.  I have a listing of the "rules" (grammar?) governing
>the format of each line in the listing.

>Along these lines, a problem I am having is getting the message "too
>many definitions" from lex, when all I have are a few keywords and
>ancillary definitions: (lex file included below for illustration).  Is

Generally speaking, identifying keywords directly in lex isn't worth
the bother.  Normally (when you're writing a compiler, for example),
once you've made this decision, the tokenizing rules are pretty easy:

    [A-Za-z][A-Za-z0-9_]*:  word
    [0-9][0-9]*:            number
    +:                      PLUS
    -:                      MINUS
    +=:                     PLUSEQ

In this case, you generally have lex search for "words", and once you've
caught one, you do some sort of hashed lookup in a keyword table to see
whether it's a keyword.  If it is, you return the YACC define for it;
if you couldn't find it, you return "IDENTIFIER".

Actually, I usually skip lex altogether - once you've eliminated explicit
keyword recognition, it's usually simpler (and a hell of a lot smaller
and faster) to code the analyzer in C.  I.e., 100 lines will often do a
reasonable job for C (excepting possibly floating point stuff).
-- 
Chris Lewis, R.H. Lathwell & Associates: Elegant Communications Inc.
UUCP: {uunet!mnetor, utcsri!utzoo}!lsuc!eci386!clewis
Phone: (416)-595-5425
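
As a rough illustration of the approach described above (match generic
words, then look them up in a keyword table), here is a minimal sketch of
a hand-coded tokenizer in C.  The token names, the two-entry keyword
table, and the next_token() interface are all made up for illustration.
A plain linear scan stands in for the hashed lookup, and in a real parser
the token codes would come from the header that yacc generates rather
than from a local enum.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; yacc would normally supply these. */
enum token {
    T_EOF = 0, T_IDENTIFIER, T_NUMBER,
    T_PLUS, T_MINUS, T_PLUSEQ,
    T_IF, T_WHILE, T_ERROR
};

/* Hypothetical keyword table; extend it as the language requires. */
struct keyword { const char *name; enum token tok; };
static const struct keyword keywords[] = {
    { "if",    T_IF },
    { "while", T_WHILE },
};

/* Return the keyword's token code, or T_IDENTIFIER if the word is not
 * in the table.  Linear scan here; a hash table would replace it. */
enum token lookup_word(const char *word)
{
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i].name) == 0)
            return keywords[i].tok;
    return T_IDENTIFIER;
}

/* Scan one token starting at *src, copy its text into buf (truncating
 * if it does not fit), and advance *src past it.
 * Assumes bufsize is at least 3 so that "+=" always fits. */
enum token next_token(const char **src, char *buf, size_t bufsize)
{
    const char *p = *src;
    size_t n = 0;
    enum token tok;

    while (isspace((unsigned char)*p))      /* skip leading whitespace */
        p++;

    if (*p == '\0') {
        tok = T_EOF;
    } else if (isalpha((unsigned char)*p)) {
        /* word: a letter followed by letters, digits or underscores */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n + 1 < bufsize)
                buf[n++] = *p;
            p++;
        }
        buf[n] = '\0';
        tok = lookup_word(buf);             /* keyword or identifier? */
    } else if (isdigit((unsigned char)*p)) {
        /* number: one or more digits */
        while (isdigit((unsigned char)*p)) {
            if (n + 1 < bufsize)
                buf[n++] = *p;
            p++;
        }
        tok = T_NUMBER;
    } else if (*p == '+') {
        buf[n++] = *p++;
        if (*p == '=') {                    /* prefer "+=" over "+" */
            buf[n++] = *p++;
            tok = T_PLUSEQ;
        } else {
            tok = T_PLUS;
        }
    } else if (*p == '-') {
        buf[n++] = *p++;
        tok = T_MINUS;
    } else {
        buf[n++] = *p++;                    /* unrecognized character */
        tok = T_ERROR;
    }

    buf[n] = '\0';
    *src = p;
    return tok;
}
```

To feed a yacc parser, a yylex() wrapper would call next_token() on the
input and return the token code.  With a large keyword set, the linear
scan in lookup_word() is the part to swap for a hash table.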