Path: utzoo!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.jpl.nasa.gov (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Frequently Asked Questions about Perl - with Answers [Monthly posting]
Message-ID: <1991Mar13.014422.27969@jpl-devvax.jpl.nasa.gov>
Date: 13 Mar 91 01:44:22 GMT
References: <1991Mar08.025232.21050@convex.com> <1991Mar08.131601.23812@convex.com> <590@sunny.ucdavis.edu> <1991Mar11.234402.18685@iwarp.intel.com>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 31

In article <1991Mar11.234402.18685@iwarp.intel.com> merlyn@iwarp.intel.com (Randal L. Schwartz) writes:
: In article <590@sunny.ucdavis.edu>, poage@sunny (Tom Poage) writes:
: | How about \{nn} and ${nn} as an option?
:
: Nopers. \{ would be special then, contrary to the design that
: backslash non-alphanum is non-special. Don't break existing scripts.

Righto.

: But I do like the ${nn}... it seems non-ambiguous now.

It's not necessary, actually. $10, $11, etc. are perfectly reasonable,
though the bracketed forms are certainly permissible (and sometimes
useful).

How to handle \10, \11, etc. is a little touchier. Here's how I decided
to do it. Any digit sequence matching /0[0-7]{0,2}/ is automatically an
octal char. Anything matching /[1-9]/ is automatically a backreference.
Anything matching /[1-9]\d+/ is a backreference if there have been that
many left parens so far in the regular expression; otherwise it's an
octal char.

This lets old scripts continue to work, since no old script has more
than nine substrings. New scripts should probably stick to \010, \011,
etc. to mean the corresponding octal character, so that \10 doesn't
change meanings when you add the 10th set of parens.

Characters like \177 are still a problem, but anybody writing patterns
with THAT many substrings can afford to think about writing the
character as \x7f instead.

I think this arrangement will be most satisfactory all around.

Larry
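
The three-way rule in the post for digits after a backslash can be modeled as a small classifier. What follows is a minimal sketch in Python, not Perl's actual implementation: the function name classify_escape and the parameter left_parens_so_far are hypothetical, and the count of left parens seen so far is assumed to be supplied by the caller rather than computed from a pattern. It only labels the digit sequence and does not decide how many digits an octal escape would consume.

```python
import re


def classify_escape(digits, left_parens_so_far):
    """Classify the digit sequence that follows a backslash in a pattern.

    Hypothetical helper modeling the rule described in the post.
    Returns "octal" or "backreference".
    """
    # A leading 0 followed by up to two octal digits is always octal.
    if re.fullmatch(r"0[0-7]{0,2}", digits):
        return "octal"
    # A single nonzero digit is always a backreference.
    if re.fullmatch(r"[1-9]", digits):
        return "backreference"
    # Two or more digits starting with a nonzero digit: a backreference
    # only if that many left parens have been seen so far.
    if re.fullmatch(r"[1-9]\d+", digits):
        if int(digits) <= left_parens_so_far:
            return "backreference"
        return "octal"
    raise ValueError("not covered by the rule: %r" % digits)


if __name__ == "__main__":
    # \10 changes meaning once the tenth set of parens is added ...
    print(classify_escape("10", 9))
    print(classify_escape("10", 10))
    # ... while \010 stays octal regardless of the paren count.
    print(classify_escape("010", 10))
```

Under this model, "10" is classified as octal with nine left parens and as a backreference with ten, which is the change of meaning the post warns about, while "010" is octal in both cases.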