Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.1 6/24/83; site hcrvx1.UUCP Path: utzoo!hcrvx1!hugh From: hugh@hcrvx1.UUCP (Hugh Redelmeier) Newsgroups: net.emacs Subject: Retrospective Pattern Matching Message-ID: <1418@hcrvx1.UUCP> Date: Mon, 3-Nov-86 14:45:57 EST Article-I.D.: hcrvx1.1418 Posted: Mon Nov 3 14:45:57 1986 Date-Received: Tue, 4-Nov-86 01:57:20 EST Organization: Human Computing Resources, Toronto Lines: 81 Some Pitfalls of Retrospective Patterns in String Substitution ============================================================== A number of text editors have substitute commands that will match patterns. This note discusses a problem that arises if patterns can "look back", that is, depend on context preceding the current point. In passing, a problem with null strings is discussed too. A "retrospective pattern" is one that can look back. In QED for the GE 600 machines, and its derivatives (notably UNIX ed and vi), the caret (^) means "match a null string at the start of a line". This pattern is retrospective in that it depends on what precedes the current position. Similarly, the University of Toronto version of ed has patterns for beginning of word (\{) and end of word (\}), and these depend on what precedes the current position (although not acknowledged, this code found its way into vi for the \< and \> operators). A pattern element that would match a specific column would also be retrospective. Many other retrospective patterns could be invented, but at least these ones exist and have proven to be useful. The problem with retrospective patterns arises when they are used in a global replacement command. How should a match be affected by a previous replacement performed by the same command? For concreteness, consider the vi command s/\ ", use the ed command: s/^/> / It is possible to get into an infinite loop with a naively implemented substitute command. The following command will cause ed to loop. The strange pattern is the simplest way of writing a null pattern. s/\(\)//g The next version will cause an error due to the line growing too long: s/\(\)/x/g The solution I have implemented is to force a search for a pattern that last matched a null string to resume one character farther on. Intuitively, this means that each null string in the subject is matched only once. This rule treats a pattern that has matched a null string as if it had a retrospective operator under the new JOVE rule. Hugh Redelmeier (416) 922-1937 utzoo!hcr!hugh