Path: utzoo!attcan!uunet!decwrl!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Recursive descent parsers in perl?
Message-ID: <8563@jpl-devvax.JPL.NASA.GOV>
Date: 2 Jul 90 16:47:47 GMT
References:
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Distribution: comp
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 19

In article vixie@decwrl.dec.com (Paul A Vixie) writes:
: Recursive-descent parsers are hard unless you can implement a "get next token"
: operator.  Even "get next character" is helpful.  Perl doesn't have an easy
: way to do either, though somewhere among the thousands of messages in Larry's
: "look at it someday" mail folder, he has several from me suggesting ways to
: tokenizing fairly elegantly.  What's needed is something like "split" only
: inside-out and backwards.  Larry?

The current way to write a tokener in Perl is to just start hacking tokens
off the front of $_ with s/^token//.  The first version of the perl debugger
worked this way, before I got smart and made the perl compiler do my parsing.
Substituting tokens off the front of a string is in general fairly efficient--
all it does internally is move up the pointer and decrease the length.

But I've hankered for a way to tokenize better too.  Whatever I do has to
be general though, and I don't think even emacs-style syntax tables are
smart enough.  So for the moment, just keep hacking tokens off the front.

Larry
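
For illustration, the "hack tokens off the front of $_ with s/^token//" approach described above might look roughly like the sketch below. This is not code from the post. The tokenize routine, the token types (NUM, ID, OP) and the toy arithmetic grammar are all hypothetical, and the sketch assumes a modern Perl 5 (my, use strict, array references) rather than the Perl of 1990.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the post): break a small arithmetic
# expression into [type, text] pairs by repeatedly substituting the
# next token off the front of $_.
sub tokenize {
    local $_ = shift;
    my @tokens;
    while (length($_)) {
        if    (s/^\s+//)            { next }                       # discard whitespace
        elsif (s/^(\d+)//)          { push @tokens, ['NUM', $1] }  # integer literal
        elsif (s/^([A-Za-z_]\w*)//) { push @tokens, ['ID',  $1] }  # identifier
        elsif (s/^([-+*\/()])//)    { push @tokens, ['OP',  $1] }  # one-character operator
        else  { die "unrecognized input at: $_\n" }
    }
    return @tokens;
}

print join(' ', map { "$_->[0]:$_->[1]" } tokenize('foo + 42*(bar - 7)')), "\n";
# prints: ID:foo OP:+ NUM:42 OP:* OP:( ID:bar OP:- NUM:7 OP:)
```

Each successful s/^...// both recognizes a token and removes it, so the remaining input is always whatever is left in $_. A recursive-descent parser could then consume the resulting list one token at a time, which is one way to get the "get next token" operation asked about in the quoted text.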