Path: utzoo!utgpu!news-server.csri.toronto.edu!rutgers!usc!zaphod.mps.ohio-state.edu!van-bc!ubc-cs!phillips
From: phillips@cs.ubc.ca (George Phillips)
Newsgroups: comp.lang.perl
Subject: Re: regexp and slice bugs
Message-ID: <8561@ubc-cs.UUCP>
Date: 5 Jul 90 05:01:19 GMT
References: <8483@ubc-cs.UUCP> <8547@jpl-devvax.JPL.NASA.GOV>
Sender: news@cs.ubc.ca
Organization: University of British Columbia, Vancouver, B.C., Canada
Lines: 30

In article <8547@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
:In article <8483@ubc-cs.UUCP> phillips@cs.ubc.ca (George Phillips) writes:
:: Under perl 3.0, patchlevel 18, the following script gives a wrong answer:
::
:: $block = "a\nd";
:: print $block =~ /^d/; print "\n";
:: print $block =~ /^\144/; print "\n";
::
:From the manual:
:
:     By default, the ^ character is only guaranteed to match at
:     the beginning of the string, the $ character only at the end
:     (or before the newline at the end) and perl does certain
:     optimizations with the assumption that the string contains
:     only one line.  The behavior of ^ and $ on embedded newlines
:     will be inconsistent.

Arg!  My apologies for not reading the manual more closely.  So the
only way to ensure an anchored match in the face of arbitrary input
is to do something like (/^foo/ && $` eq ""), right?

It seems like a cheap hack like this could be applied internally to
guarantee that ^ and $ always, always anchor a pattern match.  Would it
be preferable to make this the default behavior (i.e., no manual page
caveats) or should it be selectable by $* = 2 or something?

I think that by default ^ and $ should only ever match the beginning
and the end of the string, but $* = 2 is fine by me.  I'll see if I can
figure out how to fix things up.

George Phillips phillips@cs.ubc.ca {alberta,uw-beaver,uunet}!ubc-cs!phillips
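
The (/^foo/ && $` eq "") workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the helper name anchored_at_start is hypothetical, and the /m modifier (which perl 3.0 does not have) is used solely so that a current perl reproduces a ^ that can match after an embedded newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns 1 only if
# $pattern matches at the very beginning of $string, otherwise 0.
# The /m modifier is an assumption made for this sketch; it lets ^ match
# after an embedded newline, standing in for the inconsistent behavior
# the manual warns about.
sub anchored_at_start {
    my ($string, $pattern) = @_;
    # $` holds the text before the match; it is empty only when the
    # match really began at offset 0.
    return ($string =~ /^$pattern/m && $` eq "") ? 1 : 0;
}

my $block = "a\nd";
print anchored_at_start($block, "d"), "\n";        # "d" only follows the newline
print anchored_at_start($block, "a"), "\n";        # "a" starts the string
print anchored_at_start("foobar", "foo"), "\n";    # plain single-line case
```

The test on $` is what rejects a match that ^ found after the newline; without it, the first call would count as a hit.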