Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!ut-sally!husc6!necntc!custom!boykin
From: boykin@custom.UUCP (Joseph Boykin)
Newsgroups: comp.unix.questions
Subject: Re: sed - match newlines on input
Message-ID: <572@custom.UUCP>
Date: Mon, 9-Mar-87 23:49:19 EST
Article-I.D.: custom.572
Posted: Mon Mar 9 23:49:19 1987
Date-Received: Tue, 10-Mar-87 19:44:32 EST
References: <570@hao.UCAR.EDU>
Organization: Custom Software Systems; Natick, MA
Lines: 59
Summary: Embedded newlines are not what you think!

In article <570@hao.UCAR.EDU>, bill@hao.UCAR.EDU (Bill Roberts) writes:
> I'm trying to match a pattern over multiple lines.
> For instance, on the input:
>
> one
> two
> three
>
> with the sed script:
>
> s/one\ntwo\nthree/one, two, three/g
> one would expect to get the following output
> one, two, three
>
> OK, so I don't understand the manual (what else is new). How can I get what I
> need? Also, what about the "Multiple Input-line Functions". Might that be the
> way to go? An example would really help. Thanks in advance.
>
> Bill Roberts
> NCAR/HAO
> Boulder,CO

The SED documentation on this point is definitely confusing. When we
wrote the documentation for PC/SED we tried to make it better, but I
don't think we succeeded!

Okay, here goes: SED's regular expression handler was modified to handle
embedded newlines. Compare this to VI (UNIX or PC/VI), which CANNOT match
a pattern that crosses a line boundary. When searching for a regular
expression, you search a single null-terminated string. That string can
have any character in it, including a newline (although for the most part
the newline isn't stored). Hence your script will not work: SED does not
read the \n in your script as "when you get to the end of this string,
start comparing with the next string (pronounced 'line')".
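A quick way to see this for yourself is to run the script from the question as-is. This is only a sketch: it assumes a POSIX-style sed and a shell with printf, and the function name try_original is made up for the illustration. Since SED only ever has one input line in its working string at a time, the \n should never find anything to match:

```shell
# try_original is a hypothetical helper name, used only for this illustration.
# It runs the substitution from the question unchanged.
try_original() {
    sed 's/one\ntwo\nthree/one, two, three/g'
}

# Each line is examined on its own, so the pattern never matches and the
# three lines come out exactly as they went in.
printf 'one\ntwo\nthree\n' | try_original
```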
When the documentation talks about looking for embedded newlines, what
it means is that SED permits the user to 'join' two lines together. What
this really means is that the terminating null of the first line is
replaced by a \n and the second line is concatenated onto the end of the
first. The regular expression can now test for an embedded newline, since
all you have is one string which just so happens to have a \n in the
middle.

To be honest, I don't feel like mucking with SED long enough to give you
a script to do what you want (someone else probably will!), but I think
the basic idea is to go through the file and, for each line, join the next
two lines onto it with the 'N' command. Then you can test whether that new
'line' is the concatenation of the three you are interested in and, if so,
do your substitution.

Okay, I just reread this message and I know it isn't very clear. On the
other hand, SED is not the easiest program to understand either! If you're
still stuck, give me a call.

-- 
Joe Boykin
Custom Software Systems
...{necntc, frog}!custom!boykin
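For what it's worth, the 'N' idea described above might be sketched roughly as follows. This is not a tested PC/SED script. It assumes a POSIX-style sed, the name join_three is invented for the example, and it takes a shortcut: instead of joining at every line, it only joins when a line is exactly "one". The $! in front of each N is there so that SED does not try to read past the last line of input.

```shell
# join_three is a hypothetical helper name, not part of sed.
join_three() {
    sed -e '/^one$/{
$!N
$!N
s/^one\ntwo\nthree$/one, two, three/
}'
}

# The two N commands append "two" and "three" to the working string,
# separated by embedded \n characters, so the substitution can now match.
printf 'one\ntwo\nthree\n' | join_three
# prints: one, two, three
```

If the three joined lines are not the ones wanted, the substitution does nothing and they are printed unchanged. One limitation of this shortcut: a second "one" that gets swallowed by an N is never tried as the start of a new match.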