Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!bbn!jr@bbn.com
From: jr@bbn.com (John Robinson)
Newsgroups: gnu.emacs.bug
Subject: Re: replace-regexp
Message-ID: <38638@bbn.COM>
Date: 13 Apr 89 18:13:12 GMT
References: <8904130343.AA05725@starbase>
Sender: news@bbn.COM
Reply-To: jr@bbn.com (John Robinson)
Distribution: gnu
Organization: BBN Systems and Technologies Corporation, Cambridge MA
Lines: 66
In-reply-to: israel@STARBASE.MITRE.ORG (Bruce Israel)

In article <8904130343.AA05725@starbase>, israel@STARBASE (Bruce Israel) writes:
>
>   From: mailrus!sharkey!itivax!umich!zip!spencer@purdue.edu  (Spencer W. Thomas)
>
>   In some article loic%axis_d@axis.axis.fr writes:
>
>   >   (replace-regexp "^." "" ())
>   >   It empties my buffer !
>   >   It should skip to the next new-line. Don't you think so ?
>
>   Well, no.  Emacs is a character-oriented editor, NOT a line-oriented
>   editor.  Would you want it to skip to the next line after each
>   replacement if the strings were (say) "abc" and ""?  
>
>No, it shouldn't skip to the next line, but it SHOULD skip past the
>previously matched string.  i.e. replacing "foo bar" with "foo" should not
>change "foo bar bar " to "foo", it should become "foo bar".  A replacement
>should not be re-run on the results of the replacement.  Effectively, a
>replace-<whatever> should look like it found all non-overlapping occurences
>of the search item first, and then did the change second.  For a similar 
>example, should the null op (replace-string "a" "a") be an infinite loop?

But it did skip past the previously matched string.  The problem is in
the semantics of "^" in regexps.  When point is at the beginning of a
line, which side of the "^" is it on?  I think you are arguing that
before the replace starts, point is on the left of it (to pick up the
match on the first line), but after the first replace, point is on the
right of it (to avoid repetition).  I can see the logic for these
semantics.  The character-orientedness isn't really the problem; it is
the precise semantics of the non-character "^".  From info:

`^'     
     is a special character that matches the empty string, but only if at
     the beginning of a line in the text being matched.  Otherwise it fails
     to match anything.  Thus, `^foo' matches a `foo' which occurs
     at the beginning of a line.
     
This would have to change to:

`^'     
     is a special character that matches the empty string, but only if at
     the beginning of a line in the text being matched.  Otherwise it fails
     to match anything.  Thus, `^foo' matches a `foo' which occurs
     at the beginning of a line.  Once a `^' has been matched in a
     repeated search, it will fail to match the beginning of the same
     line.

Or some such.  I find it very hard to express this idea.  Maybe
something more along the lines of "when a search starts, the beginning
of each line is found and the first character on each line, if any, is
marked.  These characters are the only ones that will match in the
position following a `^' in a search pattern until the completion of
the repeated search."
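
To make the two readings concrete, here is a small model in Python.
It is only a sketch, not Emacs's code: the names replace_rescanning
and replace_snapshot are made up for illustration, and Python's re
module (with re.MULTILINE, so that "^" matches at line starts) stands
in for Emacs regexps.  The first function resumes the search at point
in the already modified text, so "^" is tested again wherever point
lands; that is my reading of the behavior reported.  The second finds
all non-overlapping matches in the original text first and only then
substitutes, as Bruce suggests.

```python
import re


def replace_rescanning(text, pattern, replacement):
    """Assumed model of the current behavior: after each replacement,
    searching resumes at point in the modified text."""
    regex = re.compile(pattern, re.MULTILINE)
    pos = 0
    while True:
        m = regex.search(text, pos)
        if m is None:
            return text
        # Splice in the replacement literally (no backreferences).
        text = text[:m.start()] + replacement + text[m.end():]
        # Point ends up just after the inserted replacement.
        pos = m.start() + len(replacement)
        if m.start() == m.end():
            # Empty match: step forward one character to guarantee progress.
            pos += 1
            if pos > len(text):
                return text


def replace_snapshot(text, pattern, replacement):
    """Proposed behavior: matches are located in the original text,
    then all of them are replaced."""
    regex = re.compile(pattern, re.MULTILINE)
    # A function replacement keeps the replacement string literal.
    return regex.sub(lambda m: replacement, text)


buffer_text = "abc\ndef\n"
print(repr(replace_rescanning(buffer_text, r"^.", "")))  # '\n\n'
print(repr(replace_snapshot(buffer_text, r"^.", "")))    # 'bc\nef\n'
```

In this model, replacing "foo bar" with "foo" in "foo bar bar " gives
"foo bar " under both functions, so skipping past the matched string
is not where they differ.  They differ only when a deletion leaves
point at a place where "^" can match again.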

I find the original concept more powerful and elegant.  Once you
change the behavior, getting the original behavior back would be a lot
more tedious as well.  You'd want a way to re-mark the beginnings of
all lines or something.  Or two varieties of `^'...

Enough rambling...
--
     

/jr
jr@bbn.com or bbn!jr
C'mon big money!