Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!samsung!think.com!mintaka!bloom-beacon!eru!hagbard!sunic!mcsun!unido!mikros!mwtech!martin
From: martin@mwtech.UUCP (Martin Weitzel)
Newsgroups: comp.lang.c
Subject: Re: two (or more) lex's/yacc's in one executable
Message-ID: <993@mwtech.UUCP>
Date: 10 Dec 90 15:49:46 GMT
References: <1990Dec6.200944.13037@cs.columbia.edu> <14674@smoke.brl.mil>
Reply-To: martin@mwtech.UUCP (Martin Weitzel)
Organization: MIKROS Systemware, Darmstadt/W-Germany
Lines: 94

In article <14674@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <1990Dec6.200944.13037@cs.columbia.edu>, leland@cs writes:
>- I've tried this kludge: create a header file that re-#define's all the
>- names 'yyfoo' in lex/yacc set #1 to be named, say, set1yyfoo, and all
>- those in set #2 to be named set2yyfoo.  This has worked for me in the
>- past, but won't in this particular instance because the generated
>- code includes calls to yyless() and yywrap(), which are in the LEX
>- library (-ll), the contents of which I cannot rename.  So that doesn't
>- work.
>
>But it almost does -- Since "lex" produces C source, you can #define
>set1yyless yyless, etc. before the lex output to be compiled, thereby
>turning these selected reference back into calls to the shared library
>functions.  (I assume the lex library does not maintain internal state.)

Unfortunately things are more complicated.
Here is an excerpt from `nm /usr/lib/libl.a' (UNIX Sys V):
----------------------------------------------------------------------
Symbols from /usr/lib/libl.a[reject.o]:

Name          Value   Class   Type       Size   Line  Section

reject.c    |        | file |          |      |     |
yyreject    |       0|extern|    int( )|   270|     |.text
yyracc      |     272|extern|    int( )|   154|     |.text
yyinput     |       0|extern|          |      |     |
yyleng      |       0|extern|          |      |     |
yytext      |       0|extern|          |      |     |
yylsp       |       0|extern|          |      |     |
yyolsp      |       0|extern|          |      |     |
yyfnd       |       0|extern|          |      |     |
yyunput     |       0|extern|          |      |     |
yylstate    |       0|extern|          |      |     |
yyprevious  |       0|extern|          |      |     |
yyoutput    |       0|extern|          |      |     |
yyextra     |       0|extern|          |      |     |
yyback      |       0|extern|          |      |     |

Symbols from /usr/lib/libl.a[yyless.o]:

Name          Value   Class   Type       Size   Line  Section

yyless.c    |        | file |          |      |     |
yyless      |       0|extern|    int( )|   107|     |.text
yyleng      |       0|extern|          |      |     |
yytext      |       0|extern|          |      |     |
yyunput     |       0|extern|          |      |     |
yyprevious  |       0|extern|          |      |     |

Symbols from /usr/lib/libl.a[yywrap.o]:

Name          Value   Class   Type       Size   Line  Section

yywrap.c    |        | file |          |      |     |
yywrap      |       0|extern|    int( )|    16|     |.text
----------------------------------------------------------------------

The problem is not some internal state of these functions, but that
they expect a number of external `yyfoo' symbols, and there is no way
to make them access the `right' ones without rewriting the functions.

So, how hard would it be to rewrite them? The trivial case is `yywrap'.
I hope AT&T doesn't sue me because of reverse engineering :-), but this
function is a one-liner.

	yywrap() { return 1; }

The two other functions (`yyless' and `yyreject') may have complicated
interactions with a lot of globals, so the best solution is to avoid
them and do manually what is required.

This is simple in the case of `yyless', since it is usually used to
push back parts of `yytext' to the input stream.
This can also be done with the `unput()' macro in a loop. (The library
version of `yyless' does this via the `yyunput()' function, but that
function simply calls `unput()', which may have been redefined - have
a look into `lex.yy.c' to understand how things work together.) In
addition, the "original" `yyless' adjusts `yytext' and `yyleng'
accordingly. The part that still worries me is the reference to
`yyprevious' within `yyless'. To be sure, you should probably
disassemble the library version of `yyless' - it's not that large.

`yyreject' is best avoided completely because it plays with a lot of
external symbols (the poster of the original question is lucky here,
but others may take this as a hint to use REJECT - which in turn
calls yyreject() - only as a last resort).

BTW: Another option is to have a common lexer for both sets of input
symbols and use start conditions in lex to select the appropriate
ones. It's a pity that start conditions are insufficiently explained
in the common documentation of lex (if they are mentioned at all).
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
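
A minimal sketch of the manual `yyless' replacement described above:
push the tail of `yytext' back with `unput()' in a loop, then adjust
`yytext' and `yyleng'. This is an illustration under assumptions, not
the library code. The push-back stack, `next_pushed_back()',
`set_token()' and the name `my_yyless()' are hypothetical stand-ins so
that the fragment compiles outside of lex; in a real scanner the
(renamed) `yytext', `yyleng' and `unput()' from `lex.yy.c' would take
their place. Whatever the library version does with `yyprevious' is
deliberately left out, since that part is unverified.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for what lex.yy.c normally provides (assumptions for this
 * sketch only; a real scanner would use its own renamed symbols). */
#define PUSHBACK_MAX 256
static char pushback[PUSHBACK_MAX];   /* hypothetical push-back stack */
static int  pushback_top = 0;

static char yytext[PUSHBACK_MAX];     /* text of the current token */
static int  yyleng = 0;               /* its length */

/* Push one character back; the character pushed last is read first. */
#define unput(c) \
    (assert(pushback_top < PUSHBACK_MAX), \
     pushback[pushback_top++] = (char)(c))

/* Hypothetical reader: next pushed-back character, or -1 if none. */
static int next_pushed_back(void)
{
    return pushback_top > 0 ? (unsigned char)pushback[--pushback_top] : -1;
}

/* Keep the first n characters of yytext and push the rest back.
 * Does NOT touch yyprevious (see the caveat in the text). */
static void my_yyless(int n)
{
    int i;

    assert(n >= 0 && n <= yyleng);
    /* Go from the end towards n so the characters are re-read
     * in their original order. */
    for (i = yyleng - 1; i >= n; i--)
        unput(yytext[i]);
    yytext[n] = '\0';
    yyleng = n;
}

/* Hypothetical helper: pretend the scanner has just matched s. */
static void set_token(const char *s)
{
    assert(strlen(s) < PUSHBACK_MAX);
    strcpy(yytext, s);
    yyleng = (int)strlen(s);
}
```

To try it, call `set_token("foobar")' followed by `my_yyless(3)':
`yytext' is then "foo", and the characters `b', `a', `r' come back
from the push-back stack in that order.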