Newsgroups: comp.compilers Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!decwrl!world!iecc!compilers-sender From: ressler@cs.cornell.edu (Gene Ressler) Subject: Re: YACC, going the other way Message-ID: <1991May1.022140.28194@cs.cornell.edu> Followup-To: ressler@cs.cornell.edu Summary: Deterministic enumeration of CFLs Keywords: yacc, testing, Lisp, theory Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: ressler@cs.cornell.edu (Gene Ressler) Organization: Cornell Univ. CS Dept, Ithaca NY 14853 References: <1991Apr23.140427.5416@iecc.cambridge.ma.us> <72058@microsoft.UUCP> Date: Wed, 1 May 1991 02:21:40 GMT Approved: compilers@iecc.cambridge.ma.us In article <1991Apr23.140427.5416@iecc.cambridge.ma.us> elk@cblpn.att.com (Edwin Lewis King +1 614 860 3394) writes: >I'm interesting in generating strings that are described by a BNF (OK, >YACC) grammar. Many decidablility results rely on Turing machines enumerating CFLs, so we'd better be able to do it without random numbers! I know this may not be what you want for testing, but I think it's interesting anyway. Following is a rough hack in Common Lisp (runs under Lucid) that keeps a queue of sentential forms sorted by length, pulling off the shortest one to expand next. `Expand' means replace each non-terminal A by the rhs of each A-production in all combinations. If a resulting form has only non-terminals, print it; otherwise queue it for deeper expansion. If you have no epsilon productions, this generates strings of strictly non-decreasing length. Gene ---- ; Enumerate strings generated by a CFL. ; Warning: Rough and probably buggy code. (in-package 'user) (let ((look-ahead nil) (stream t) (rules (make-hash-table)) (start nil) (q nil)) (defmacro lex () `(setq look-ahead (read stream nil))) (defun parse (in-stream) ; set up (clrhash rules) (setq stream in-stream) (lex) ; assume first sym is start sym (setq start look-ahead) (loop ; done when lex returns nil for eof (unless look-ahead (return)) ; look ahead is lhs. gather rhs. (let ((prod look-ahead) (rhs nil)) (lex) (unless (eq look-ahead '->) (error "missing ->")) (lex) (loop ; gather up to ! or !! (`or' or end of prod) (loop (when (member look-ahead '(! !! ->)) (return)) (push look-ahead rhs) (lex)) ; put production in rules hash table indexed by lhs ; (prod) so each entry is a list of rhs's (push (reverse rhs) (gethash prod rules)) (case look-ahead (!! (lex) (return)) (! (lex)) (t (error "unexpected ~A" look-ahead))) ; start new rhs for same lhs (prod) (setq rhs nil)))) ; check for undefined symbols (maphash #'(lambda (prod rhss) (declare (ignore prod)) (dolist (rhs rhss) (dolist (x rhs) (unless (or (stringp x) (gethash x rules)) (error "~A undefined" rhs))))) rules) ; return start symbol start) ; don't put any sentential form longer than this on the queue (defparameter *cutoff-length* 10) ; insert form in queue sorted ascending by length (defun enq (s-form) (when (< (list-length s-form) *cutoff-length*) (setq q (merge 'list (list s-form) q #'(lambda (x y) (< (list-length x) (list-length y))))))) ; get shortest sentential form from queue (defmacro qpop () `(pop q)) ; expand rhs every way that is possible by ; expanding each non-terminal exactly once. ; accumulate result in `sofar'. when rhs ; is gone, look at `sofar' to see if it's all ; terminals (strings). if so, print it; if not, ; queue it for deeper expansion. (defun expand (rhs sofar) (cond ((null rhs) (if (every #'stringp sofar) (format t "~&~A" (reduce #'(lambda (x y) (concatenate 'string x y)) (reverse sofar))) (enq (reverse sofar)))) ((stringp (car rhs)) (expand (cdr rhs) (cons (car rhs) sofar))) (t (dolist (rrhs (gethash (car rhs) rules)) (expand (cdr rhs) (revappend rrhs sofar)))))) ; expand start symbol, iterate (expand) ; until the queue is empty. (defun enum (start) (setq q nil) (dolist (rhs (gethash start rules)) (expand rhs nil)) (loop (let ((rhs (qpop))) (unless rhs (return)) (expand rhs nil)))) ) ; end (let (look-ahead ... ; tester assumes a grammar is in file "grammar" (defun test () (with-open-file (grammar "grammar" :direction :input) (enum (parse grammar)))) ; a sample grammar. ; if you need epsilon, just say "". #| S -> expr !! expr -> term ! expr "+" term !! term -> factor ! term "*" factor !! factor -> "1" ! "a" ! "(" expr ")" !! |# -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.