Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!decwrl!world!iecc!compilers-sender From: jimad@microsoft.UUCP (Jim ADCOCK) Newsgroups: comp.compilers Subject: Re: YACC, going the other way Keywords: yacc, testing Message-ID: <72058@microsoft.UUCP> Date: 26 Apr 91 19:35:11 GMT References: <1991Apr23.140427.5416@iecc.cambridge.ma.us> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: jimad@microsoft.UUCP (Jim ADCOCK) Organization: Microsoft Corp., Redmond WA Lines: 167 Approved: compilers@iecc.cambridge.ma.us In article <1991Apr23.140427.5416@iecc.cambridge.ma.us> elk@cblpn.att.com (Edwin Lewis King +1 614 860 3394) writes: >I'm interesting in generating strings that are described by a BNF (OK, >YACC) grammar. I don't know aobut "cleanly and efficiently", but, in theory, it's a lot easier to generate examples of strings from a grammar than to recognize them. Below is a fragment out of a program I wrote to generate "C++" - like strings [I wrote one version of the program based on Stroustrup's published grammar, and one version based on Roskind's] The only mysterious part of the fragment is the switch statements switch(W0) and switch(R5) "W0" for example is a macro that expands to randomly generate a 1/0 int -- weighted to more likely choose 0 than 1. "R5" is a macro that expands to one of the numbers 0-5, equally weighted. The real problem in randomly generating strings like this is that the production rules tend to form "feedback loops" such that some productions are much more likely to be produced repeatedly than others. Hence the hand generated "tweaks" to try to weight some cases more than others. Also, remember than generating some strings based on grammars used in C++ compilers is a very very different thing than trying to automatically generate "C++ code." Only part of the problem relates to the use of fed-back type info in C/C++. Here's some examples of such productions: .... /***********************/ struct TSH ; /***********************/ volatile & Tq :: Tm :: operator Tm * const ( ) { auto operator Tr , ih ; struct Tr extern typedef ; int io ( ( signed ) - :: new float * ++ - ( const volatile Td volatile ) short ( ~ + ( volatile ) -- sizeof ++ -- sizeof sizeof sizeof & TN ( * sizeof ( void short void .... ----------- a fragment from the producing program: .... void conditional_expression() { switch(W0) { case 0: logical_or_expression(); break; case 1: logical_or_expression(); Q(); expression(); COLON(); conditional_expression(); break; } } void logical_or_expression() { switch(W0) { case 0: logical_and_expression(); break; case 1: logical_or_expression(); OROR(); logical_and_expression(); break; } } void logical_and_expression() { switch(W0) { case 0: inclusive_or_expression(); break; case 1: logical_and_expression(); ANDAND(); inclusive_or_expression(); break; } } void inclusive_or_expression() { switch(W0) { case 0: exclusive_or_expression(); break; case 1: inclusive_or_expression(); OR(); exclusive_or_expression(); break; } } void exclusive_or_expression() { switch(W0) { case 0: and_expression(); break; case 1: exclusive_or_expression(); XOR(); and_expression(); break; } } void and_expression() { switch(W0) { case 0: equality_expression(); break; case 1: and_expression(); AND(); equality_expression(); break; } } void equality_expression() { switch(R5) { case 0: case 3: case 4: relational_expression(); break; case 1: equality_expression(); EQEQ(); relational_expression(); break; case 2: equality_expression(); NOTEQ(); relational_expression(); break; } } .... -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.