Path: utzoo!mnetor!uunet!husc6!rutgers!lll-lcc!ames!umd5!uvaarpa!mcnc!ece-csc!ncrcae!ncr-sd!hp-sdd!hplabs!hpda!hpcupt1!hpirs!wk
From: wk@hpirs.HP.COM (Wayne Krone)
Newsgroups: comp.bugs.sys5
Subject: Re: Bug in sed regexps ?
Message-ID: <3920004@hpirs.HP.COM>
Date: 28 Dec 87 23:39:27 GMT
References: <9578@santra.UUCP>
Organization: Hewlett Packard, Cupertino
Lines: 36

The behaviour noted is correct.  The apparent problem can be reduced
to the first line of the sed script:

    sed -e 's/^\([^.]*\)[^:]*:\([^ ]*\) \1/\2 \1/'

being processed against the third line of the input file:

    stdipc.3c:.TH STDIPC 3C "" "" HP-UX ftok \- standard ...

which gives the result:

    .TH STDIPC 3C "" "" HP-UX ftok \- standard ...

when what was wanted was for that line of the sed script to leave
that line of input unchanged.

The first line of the sed script was intended to operate on lines in
which the text before the first "." appears again after the first
space following the ":".  It was therefore expected that line 3 of
the input file would not be processed, because the obvious match for
\1, "stdipc", does not appear a second time in the input line after
a space.  However, based upon the regular expression, the non-obvious
match for \1 is zero characters (the null string), and a zero-length
pattern does match after the space.

Stating the problem another way: while the regular expression
establishes that \1 cannot extend past the first ".", it fails to
prevent [^:]* from matching characters before the first ".".  The
solution, as you have already noted, is to establish an explicit
boundary between the \1 and [^:]* expressions.

Wayne Krone
Hewlett-Packard
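
For anyone who wants to reproduce the behaviour described above, here is a minimal sketch. It assumes a POSIX sh and sed. The anchored expression is only one possible way to add the explicit boundary (a literal "\." immediately after the first group), not necessarily the fix the original poster used, and the second input line is a hypothetical example of the intended kind of match.

```shell
#!/bin/sh
# Line 3 of the input file, as quoted in the post.
line='stdipc.3c:.TH STDIPC 3C "" "" HP-UX ftok \- standard ...'

# Hypothetical input line of the kind the script was meant to rewrite.
wanted='ftok.3c:.SH ftok'

# Original expression: the first group may match the empty string, so
# [^:]* is free to swallow "stdipc.3c" and the trailing \1 matches nothing.
original=$(printf '%s\n' "$line" |
    sed -e 's/^\([^.]*\)[^:]*:\([^ ]*\) \1/\2 \1/')

# Assumed fix: a literal "." right after the first group forces the group
# to end exactly at the first ".", so it can no longer be empty here.
anchored=$(printf '%s\n' "$line" |
    sed -e 's/^\([^.]*\)\.[^:]*:\([^ ]*\) \1/\2 \1/')

# The anchored expression still rewrites a line that really repeats the name.
rewritten=$(printf '%s\n' "$wanted" |
    sed -e 's/^\([^.]*\)\.[^:]*:\([^ ]*\) \1/\2 \1/')

printf 'original:  %s\n' "$original"
printf 'anchored:  %s\n' "$anchored"
printf 'rewritten: %s\n' "$rewritten"
```

With the original expression the "stdipc.3c:" prefix is stripped. With the boundary in place the first group is pinned to "stdipc", the back-reference fails against "STDIPC", and the line passes through unchanged.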