Xref: utzoo comp.unix.questions:25297 comp.lang.misc:5460 gnu.misc.discuss:1743
Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!uakari.primate.wisc.edu!crdgw1!underdog!volpe
From: volpe@underdog.crd.ge.com (Christopher R Volpe)
Newsgroups: comp.unix.questions,comp.lang.misc,gnu.misc.discuss
Subject: Re: Problem with YACC and LEX (Bison and Flex)
Keywords: yacc lex bison flex
Message-ID: <11700@crdgw1.crd.ge.com>
Date: 7 Sep 90 17:59:10 GMT
References: <9138@potomac.ads.com>
Sender: news@crdgw1.crd.ge.com
Reply-To: volpe@underdog.crd.ge.com (Christopher R Volpe)
Followup-To: comp.unix.questions
Lines: 73

In article <9138@potomac.ads.com>, jtn@potomac.ads.com (John T. Nelson) writes:
|>I've written a grammar to recognize the following input
|>sequence:
|>
|>A = \n
|>	B \n
|>	C \n
|>	D \n
|>E = \n
|>	F \n
|>	G \n
|>	H \n
|>The grammar looks a lot like this:
|>SECTION
|>	:
|>	NAME	SPECS
|>|
|>	SECTION		NAME	SPECS
|>	;
|>NAME
|>	:
|>	TOKEN_WORD	TOKEN_EQUAL_SIGN	OPTIONAL_THING	LINE_FEED
|>|
|>	NUMBER	TOKEN_EQUAL_SIGN	OPTIONAL_THING	LINE_FEED
|>	;
|>SPECS
|>	:
|>	SPEC
|>|
|>	SPECS	SPEC
|>	;
|>SPEC
|>	:
|>	TOKEN_WORD	OPTIONAL_THING	LINE_FEED
|>|
|>	NUMBER	OPTIONAL_THING	LINE_FEED
|>	;
|>OPTIONAL_THING
|>	:
|>|
|>	TOKEN_SPECIAL_SYMBOL
|>	;
|>Now the Bison processor does indeed catch a bunch of shift/reduce and 2
|>reduce/reduce conflicts however the "TOKEN_EQUAL_SIGN" in the NAME rule
|>should be enough to uniquify it from the "SPEC" rules and that's what
|>bugs me..... Bison is unable to make a decision on the string when the
|>token that resolves the ambiguity is one or perhaps two tokens ahead
|>in the stream.

The problem isn't that the grammar is ambiguous, because it's not.
The problem is that the grammar is not LALR(1), as yacc (and probably
Bison) require. When the lookahead symbol is a TOKEN_WORD or a NUMBER
(after just having reduced to SPEC and then SPECS), it can't tell
whether it should reduce to SECTION or shift the current input symbol.
You would need an LALR(2) parser to do that.                            
I'm not sure if this will work, but you might want to try making
the SECTION rule right-recursive. This will consume MUCHO stack space
during the parse because the reductions aren't done until the end of the
input is reached. Therefore, the stack requirement is proportional
to the size of the input, rather than constant, as would be the
case with left-recursion (because the reductions are done at each
step). So, if you have stack space to burn, try the following: (again,
I'm not sure it will work anyway)
SECTION
       :
       NAME SPECS
   |    NAME SPECS SECTION

Hope this helps...
==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com