Path: utzoo!attcan!uunet!cs.utexas.edu!usc!ucsd!ucbvax!iwarp.intel.com!news
From: merlyn@iwarp.intel.com (Randal Schwartz)
Newsgroups: comp.lang.perl
Subject: Regexps (was Re: Changing the first character of a string.)
Message-ID: <1990Jul3.184351.3820@iwarp.intel.com>
Date: 3 Jul 90 18:43:51 GMT
References: <1990Jul3.144552.5407@uvaarpa.Virginia.EDU>
Sender: news@iwarp.intel.com
Reply-To: merlyn@iwarp.intel.com (Randal Schwartz)
Organization: Stonehenge; netaccess via Intel, Beaverton, Oregon, USA
Lines: 48
In-Reply-To: worley@compass.com (Dale Worley)

In article <1990Jul3.144552.5407@uvaarpa.Virginia.EDU>, worley@compass (Dale Worley) writes:
| Also, this illustrates one thing I don't like about regexps -- people
| write code which depends on the order in which the alternatives are
| matched.

Regexps are well-defined and extremely predictable about their
"leftmost wildcard matches the most possible iterations" behavior.
It's no more silly than presuming that "a" really matches "a", is it?

| For instance, in the regexp above, the case where [^\0]?
| matches the null string can always match, so it implicitly depends on
| the fact that the non-null match is tried first.  Ugh.

It does this *by* *definition*.  No assumption necessary.

| On the other hand,
| it's hard (impossible?) to write a regexp which matches in only the
| right way without some way to specify context for the match (shades of
| \: and \;!!!).

It's probably durn near impossible, and an unnecessary burden on the
part of the programmer.  For example, what is the context for matching
/ab.*cd/ in the string "aaababfoocdcdce"?

I frequently run up against * and + matching a bit too much, and want
to "back it off" a bit, but have found the problem to be without a
general solution.  Take, for example, matching the first two-digit
number in a line and discarding all text before it.
I want to write:

	s/.*(\d\d)/\1/;

but instead am forced to do something like:

	/\d\d.*/; $_ = $&;

The first expression matches the *last* occurrence of two digits.  (I
know... the second one discards the newline... gimme a break.)

Regexps... your best friend... your worst enemy.  You decide. :-)

s//Just another Perl hacker,/; print
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/
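
For anyone who wants to try the two cases from the post, here is a minimal sketch, not a definitive treatment. It shows what /ab.*cd/ picks out of "aaababfoocdcdce" and how the two-digit idioms differ. Every sub name is hypothetical, invented for this example; $1 stands in for \1 in the replacement, and a capture stands in for $&. The last sub assumes a Perl that supports the non-greedy .*? quantifier (Perl 5 or later), which postdates this post.

```perl
use strict;
use warnings;

# NOTE: all sub names below are hypothetical, chosen only for this sketch.

# Text matched by /ab.*cd/: the match starts at the leftmost "ab", and
# the greedy .* then runs to the last "cd" that lets the match succeed.
sub greedy_span {
    my ($s) = @_;
    return $s =~ /(ab.*cd)/ ? $1 : undef;
}

# The substitution from the post (with $1 in place of \1): the greedy
# .* swallows everything up to the LAST pair of digits.
sub keep_from_last_pair {
    my ($s) = @_;
    (my $copy = $s) =~ s/.*(\d\d)/$1/;
    return $copy;
}

# The workaround from the post, with a capture instead of $&: an
# unanchored match begins at the FIRST pair of digits.  Returning the
# input unchanged when nothing matches is a choice made for this sketch.
sub keep_from_first_pair {
    my ($s) = @_;
    return $s =~ /(\d\d.*)/ ? $1 : $s;
}

# Assumption: Perl 5 or later, where .*? matches as little as possible.
sub keep_from_first_pair_nongreedy {
    my ($s) = @_;
    (my $copy = $s) =~ s/.*?(\d\d)/$1/;
    return $copy;
}

print greedy_span("aaababfoocdcdce"), "\n";                     # ababfoocdcd
print keep_from_last_pair("abc12def34xyz"), "\n";               # 34xyz
print keep_from_first_pair("abc12def34xyz"), "\n";              # 12def34xyz
print keep_from_first_pair_nongreedy("abc12def34xyz"), "\n";    # 12def34xyz
```

On a line that still carries its trailing newline, keep_from_first_pair drops the newline (because . does not match it), while the substitution forms leave it in place.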