Path: utzoo!news-server.csri.toronto.edu!rutgers!uwm.edu!wuarchive!usc!snorkelwacker.mit.edu!bloom-picayune.mit.edu!athena.mit.edu!jik
From: jik@athena.mit.edu (Jonathan I. Kamens)
Newsgroups: comp.unix.shell
Subject: Re: Puzzled by A Regexp...
Keywords: What's the extent of the match?
Message-ID: <1991Mar5.004350.20923@athena.mit.edu>
Date: 5 Mar 91 00:43:50 GMT
References: <10469@ncar.ucar.edu>
Sender: news@athena.mit.edu (News system)
Distribution: usa
Organization: Massachusetts Institute of Technology
Lines: 27

In article <10469@ncar.ucar.edu>, tres@virga.rap.ucar.edu (Tres Hofmeister) writes:
|> It grabs entries with one or more members, true, but also grabs
|> entries with no members, e.g. "news:*:6:".  I figured that this regexp
|> would match the longest possible string at the beginning of a line,
|> terminated by a colon, which in the group file should include the first
|> two colons, followed by at least one character.  It seems to be doing
|> something else, given that it will also match a line with no members.

  Each segment of a regular expression matches the longest possible string
that it can match *while allowing the rest of the regular expression to
match as well*.

  So, let's analyze what happens when the regexp "^.*:..*" is compared to
"news:*:6:".  It will first match the colon in that regexp against the
last colon in the string.  But then it will discover that when it does
that, the rest of the regexp can't be matched.  So it will back off and
see if "^.*:" can be matched against something shorter.  As a result, the
colon will get matched up with the second to last colon in the string, and
the "..*" will match against "6:".

  I hope this clears things up for you.

-- 
Jonathan Kamens                              USnail:
MIT Project Athena                           11 Ashford Terrace
jik@Athena.MIT.EDU                           Allston, MA  02134
Office: 617-253-8085                         Home: 617-782-0710
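
The backing-off described above can be checked with a short script. This is
only a sketch, and it rests on some assumptions: it uses Python's re module
rather than grep or whatever tool the original question used, the capture
groups are added purely to show what each segment ends up matching, and the
"staff" line is a made-up group entry for comparison. Python's engine is a
greedy backtracking matcher, so it should behave as described for this
pattern, but other regexp implementations may arrive at the match differently.

```python
import re

# The pattern from the post, "^.*:..*", with two capture groups added
# (an addition for illustration) so each segment's match is visible.
PATTERN = re.compile(r"^(.*):(..*)")


def split_match(line):
    """Return (text matched by '.*', text matched by '..*'), or None."""
    m = PATTERN.match(line)
    return m.groups() if m else None


# Entry with no members: '.*' first runs to the last colon, finds that
# '..*' has nothing left to match, and backs off to the previous colon.
print(split_match("news:*:6:"))             # ('news:*', '6:')

# Hypothetical entry with members: no backing off past the last colon
# is needed, because '..*' can match the member list.
print(split_match("staff:*:10:alice,bob"))  # ('staff:*:10', 'alice,bob')
```

In the first case the second group comes back as "6:", which is why a line
with no members still matches.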