Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!wuarchive!zaphod.mps.ohio-state.edu!think.com!linus!linus!gong!lamour
From: lamour@gong.mitre.org (Michael Lamoureux)
Newsgroups: comp.lang.perl
Subject: Re: pattern matching question
Message-ID: <1991May24.133427.4721@linus.mitre.org>
Date: 24 May 91 13:34:27 GMT
References: <1991May22.193037.12166@cherokee.uswest.com> <1991May22.221507.12660@cherokee.uswest.com> <49420@ut-emx.uucp>
Sender: news@linus.mitre.org (News Service)
Reply-To: lamour@mitre.org
Organization: The MITRE Corporation, McLean, Va
Lines: 61
Nntp-Posting-Host: gong.mitre.org

In article , sherman@unx.sas.com (Chris Sherman) writes:
|> In <49420@ut-emx.uucp> dboles@ccwf.cc.utexas.edu (David Boles) writes:
|>
|> >Warning: PERL NOVICE approaching !!!
|>
|> Warning: PERL NOVICE answering!!!

Ditto.  (But I am avidly reading the book...)

|> >I am using:
|> >while (<>) {
|> >   s/(p\d*) (p\d*) (\d*) (\d*)/$1 -$3 -$4\n$2 $3 $4\n/;
|> >   print;
|> >}
|> >Why aren't $3 and $4 "alive" in the first half of the replacement
|> >string?  What am I missing?
|>
|> I think I got it.  Perl is taking your test string literally, space
|> for space.

This is exactly it.

|> #!/usr/local/bin/perl
|> while (<>) {
|>    s/(p\d*) *(p\d*) *(\d*) *(\d*)/$1 -$3 -$4\n$2 $3 $4\n/;
|>    print;
|> }
|>
|> Maybe perl pro's can tell me what the '*'s meant exactly, why
|> they are working, and if they would work in every case.
|> I was hoping to set up a one-or-more-number-of
|> spaces type thing, but I don't think I did that right).

Well, your expression tests for 0 or more spaces.  A "*" matches 0 or
more occurrences of the thing before it; a "+" matches 1 or more.  So
using "+" instead of "*" would fix that, but I think using "\s" instead
of " " would be more general: it matches any whitespace (tabs
included), not just a space.  So I guess it should look like this:

while (<>) {
   s/(p\d+)\s+(p\d+)\s+(\d+)\s+(\d+)/$1 -$3 -$4\n$2 $3 $4\n/;
   print;
}

Because this uses "\d+" rather than "\d*", it also matches the p's only
if they have numbers appended.
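To make the difference between " ", " *" and "\s+" concrete, here is a small sketch that runs the three separator styles against a few sample lines.  The sample lines and the names (@samples, %patterns, matches) are made up for illustration and are not from the original question; the code assumes a reasonably modern Perl 5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, invented for this illustration.
my @samples = (
    "p1 p2 3 4",        # fields separated by single spaces
    "p1   p2\t3  4",    # runs of spaces and a tab
    "p1p2 3 4",         # no separator between the two p fields
);

# The three separator styles discussed above.
my @order    = ('one space', 'zero or more', 'whitespace');
my %patterns = (
    'one space'    => qr/(p\d+) (p\d+) (\d+) (\d+)/,
    'zero or more' => qr/(p\d+) *(p\d+) *(\d+) *(\d+)/,
    'whitespace'   => qr/(p\d+)\s+(p\d+)\s+(\d+)\s+(\d+)/,
);

# Returns 1 if the named pattern matches the line, 0 otherwise.
sub matches {
    my ($name, $line) = @_;
    return $line =~ $patterns{$name} ? 1 : 0;
}

for my $line (@samples) {
    (my $shown = $line) =~ s/\t/\\t/g;    # make tabs visible in the report
    for my $name (@order) {
        printf "%-13s %-18s %s\n", $name, "\"$shown\"",
            matches($name, $line) ? "match" : "no match";
    }
}
```

Only the "\s+" version accepts the line containing a tab, and only the " *" version accepts "p1p2 3 4", since zero spaces between fields is fine as far as "*" is concerned.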
So even better...

while (<>) {
   if (/.*(p\d+)\s+(p\d+)\s+(\d+)\s+(\d+).*/) {
      print "$1 -$3 -$4\n$2 $3 $4\n";
   }
}

This allows you to put comments or something else in the file and only
prints out the strings which match (and drops typos...you may want to
change this expression a bit and flag errors with an else).  Note that
"." matches any character except a newline.  (The pattern is not
anchored, so it will find a match anywhere in the line even without the
leading and trailing ".*".)

Michael
lamour@mitre.org

Disclaimer:  Perl is addictive ;-)
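P.S.  A minimal sketch of the "flag errors with an else" idea, under some assumptions that are not in the original question: blank lines and lines starting with "#" count as comments, and anything else that fails to match is reported on stderr.  The helper name convert and the sample input are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the two output lines for a matching input line,
# or undef if the line does not have the expected four fields.
sub convert {
    my ($line) = @_;
    if ($line =~ /(p\d+)\s+(p\d+)\s+(\d+)\s+(\d+)/) {
        return "$1 -$3 -$4\n$2 $3 $4\n";
    }
    return;
}

# Hypothetical input, standing in for the lines read by while (<>).
my @input = (
    "# a comment line\n",
    "p1 p2 3 4\n",
    "\n",
    "p1 q2 3 4\n",    # typo: second field does not start with p
);

my $lineno = 0;
for my $line (@input) {
    $lineno++;
    next if $line =~ /^\s*(#|$)/;    # assumption: skip comments and blanks
    my $out = convert($line);
    if (defined $out) {
        print $out;
    } else {
        warn "line $lineno: no match: $line";
    }
}
```

To use this as a filter, replace the loop over @input with while (<>) and report $. as the line number.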