Path: utzoo!attcan!uunet!ogicse!hakanson
From: hakanson@ogicse.ogc.edu (Marion Hakanson)
Newsgroups: comp.lang.perl
Subject: Re: Recursive descent parsers in perl?
Message-ID: <10292@ogicse.ogc.edu>
Date: 2 Jul 90 21:55:46 GMT
References: <YUKNGO.90Jun28100354@obelix.gaul.csd.uwo.ca> <VIXIE.90Jun30105204@volition.pa.dec.com> <8563@jpl-devvax.JPL.NASA.GOV>
Distribution: comp
Organization: Oregon Graduate Institute (formerly OGC), Beaverton, OR
Lines: 25

In article <8563@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>. . .
>The current way to write a tokener in Perl is to just start hacking tokens
>off the front of $_ with s/^token//.  The first version of the perl debugger

But if things are really complicated, you can do what I did, which is
to write a lexical analyzer in C (or lex, or flex), and have that spit
out an expression with some special delimiter between tokens, and then
use split() to break the entire expression into an array.  I found
this to work well for parsing DNS (Domain) master files, as the lexer
dealt with quoting, continuation lines, escaped characters, etc.,
producing an equivalent simplified ("canonical") form which was easy
for Perl to split up efficiently.  My next iteration of the software
(if there is one) will likely move even more of the parsing into C (or
yacc, or whatever).
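
For the curious, the Perl side of that arrangement might look roughly
like the sketch below.  It is not my actual code: the \001 delimiter,
the split_record name, the "dnslex" command in the comment, and the
sample record are all made up for illustration, and the syntax assumes
a Perl with "my" and "use strict".

```perl
use strict;
use warnings;

# Assumed delimiter: a control character unlikely to appear in real tokens.
my $DELIM = "\001";

# Break one canonical record (one line of lexer output) into its tokens.
sub split_record {
    my ($line) = @_;
    chomp $line;
    # Limit -1 keeps trailing empty tokens instead of silently dropping them.
    return split /\Q$DELIM\E/, $line, -1;
}

# Stand-in for a line read from the external lexer, which would normally
# arrive over a pipe, e.g.
#   open(my $lex, '-|', 'dnslex', $file)    # 'dnslex' is a made-up name
my $sample = join($DELIM, 'example.com.', 'IN', 'MX', '10',
                  'mail.example.com.') . "\n";

my @tokens = split_record($sample);
print scalar(@tokens), " tokens: @tokens\n";
```

The point of the design is that all the hard cases (quotes, escapes,
continuation lines) are resolved before Perl sees the data, so a single
split() per record is enough.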

One other technique I've used is in designing the syntax of
configuration files or tables for Perl programs.  I've taken to
specifying them in Perl syntax, so I can simply "do" the files to
parse them.  If you're careful, config-file maintainers won't even
know they're writing Perl code.
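
A minimal sketch of the "do" approach follows.  The variable names
($spool_dir, @hosts, %limit) and the file contents are hypothetical,
and the sketch writes its own sample file only so that it can be run
as-is; it assumes a Perl recent enough to ship File::Temp.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Settings the config file is expected to assign (hypothetical names).
our ($spool_dir, @hosts, %limit);

# Write a sample config so the sketch is self-contained; normally this
# file would already exist and be edited by its maintainers.
my ($fh, $path) = tempfile(SUFFIX => '.conf', UNLINK => 1);
print $fh <<'EOF';
# Lines starting with '#' are comments.
$spool_dir = '/var/spool/demo';
@hosts     = ('alpha', 'beta', 'gamma');
%limit     = ('alpha', 10, 'beta', 20);
1;
EOF
close($fh) or die "close $path: $!";

# Use an absolute path so "do" does not go searching @INC for the file.
$path = File::Spec->rel2abs($path);

# "do" compiles and runs the file, returning its last value (the 1 above).
my $ok = do $path;
die "cannot parse $path: $@" if $@;
die "cannot read $path: $!"  unless defined $ok;

print "spool_dir = $spool_dir\n";
print "hosts     = @hosts\n";
```

The obvious caveat is that the file is executed, not merely read, so
this only suits config files whose maintainers you trust.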

-- 
Marion Hakanson         Domain: hakanson@cse.ogi.edu
                        UUCP  : {hp-pcd,tektronix}!ogicse!hakanson