Path: utzoo!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!sun!plx!evan
From: evan@plx.UUCP (Evan Bigall)
Newsgroups: comp.unix.questions
Subject: Re: YACC question
Message-ID: <2210@plx.UUCP>
Date: 16 Jan 90 00:31:28 GMT
References: <8ZgGBgG00VsnQDgUNO@andrew.cmu.edu>
Reply-To: evan@plx.UUCP (Evan Bigall)
Organization: Plexus Computers; San Jose, CA
Lines: 48

>
>    expr:   mulexpr PLUS mulexpr
>        |   mulexpr MINUS mulexpr
>
>It's very straightforward; the yylex() routine must be written to return
>the constant PLUS when it encounters a '+' in the input, and the
>constant MINUS when it encounters a '-' in the input.  However, Yacc
>allows you to rewrite the above fragment as
>
>    expr:   mulexpr '+' mulexpr
>        |   mulexpr '-' mulexpr
>
>My question is, where does Yacc find the '+' and the '-' characters?
>Apparently they're not gotten via a call to yylex().  Does Yacc simply
>do a getchar()?

Quoting from the yacc section of my sys5.2 "Support Tool Guide":

}    The rules section is made up of one or more grammar rules.  A grammar
}rule has the form
}
}A : BODY ;
}
}where "A" represents a nonterminal name, and "BODY" represents a sequence of
}zero or more names and LITERALS {my emphasis}.  The colon and the semicolon
}are yacc punctuation.

{later it says:}

}A literal consists of a character enclosed in single quotes (').  As in C
}language, the backslash (\) is an escape character within literals....

Really, all that is going on here is that yacc uses the value of the
character literal as the token number.  The characters still come from
yylex(); yacc never reads the input itself.  This is why the yacc-generated
token numbers start at 257 (on machines with "normal" char sets).

The standard way to represent this as a lex rule is:

.       return(*yytext);

which returns the character itself as the token for every character not
recognized by another rule.

Evan

-- 
Evan Bigall, Plexus Software, Santa Clara CA (408)982-4840  ...!sun!plx!evan
"I barely have the authority to speak for myself, certainly not anybody else"
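
To make the token-number point concrete, here is a minimal sketch of a hand-written yylex() in C, for the case where lex is not used. It is only an illustration under stated assumptions: NUMBER and its value 257 stand in for whatever a %token declaration would put in y.tab.h, and the input buffer is a hypothetical stand-in for a real input stream. yylval is defined here only so the fragment compiles by itself; in a real program the yacc-generated parser defines it.

```c
#include <ctype.h>

/* Assumed token number, standing in for what "%token NUMBER" would
   produce in y.tab.h.  Named tokens sit above the character range. */
#define NUMBER 257

/* Hypothetical input buffer for this sketch; a real lexer would read
   from a file or stdin instead. */
const char *input = "";

/* Normally defined by the yacc-generated parser; defined here only
   to keep the fragment self-contained. */
int yylval;

/* Returns NUMBER for a run of digits, 0 at end of input, and the
   character's own value for anything else ('+', '-', ...). */
int yylex(void)
{
    /* Skip blanks and tabs between tokens. */
    while (*input == ' ' || *input == '\t')
        input++;

    if (*input == '\0')
        return 0;               /* 0 tells the parser "end of input" */

    if (isdigit((unsigned char)*input)) {
        yylval = 0;
        while (isdigit((unsigned char)*input))
            yylval = yylval * 10 + (*input++ - '0');
        return NUMBER;          /* a named token, above any char value */
    }

    /* Everything else: the character itself is the token number,
       which is what a grammar literal such as '+' expects. */
    return (unsigned char)*input++;
}
```

With a lexer along these lines, the grammar can name '+' and '-' directly, and only NUMBER needs a %token declaration.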