Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!uw-beaver!ubc-cs!van-bc!eslvcr!ted
From: ted@eslvcr.UUCP (Ted Powell)
Newsgroups: comp.unix.questions
Subject: Re: Help a novice: Will "sed" do?
Summary: Consider awk for processing multiline records
Keywords: multiline records, awk
Message-ID: <165@eslvcr.UUCP>
Date: 20 Jul 89 03:40:44 GMT
References: <2180@umbc3.UMBC.EDU> <10540@smoke.BRL.MIL>
Reply-To: ted@eslvcr.wimsey.bc.ca (Ted Powell)
Followup-To: comp.unix.questions
Organization: Entropy Limited, Vancouver, BC
Lines: 40

In article <10540@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <2180@umbc3.UMBC.EDU> rostamia@umbc3.UMBC.EDU (Dr. Rouben Rostamian) writes:
>>I need a command or a script that searches a text file for a given
>>word or pattern and prints out all paragraphs that contain that word
>>or pattern.  Paragraphs are blocks of text separated by one or
>>more blank lines.
>
>It's pretty hard to do this with standard UNIX text-file utilities,
>because most of them work on a line-at-a-time basis.  That means when
>you find the pattern, it's too late to output the preceding lines.

Use awk! See section 3.4 Multiline Records in:

	The AWK Programming Language
	Aho, Alfred V., Kernighan, Brian W., Weinberger, Peter J.
	Addison-Wesley Series in Computer Science
	ISBN 0-201-07981-X

Given input data with paragraphs separated by blank lines, the following
passes those paragraphs containing "New York" (example taken from page 83):

	... | awk 'BEGIN { RS = ""; ORS = "\n\n" }; /New York/' | ...

Setting RS (input Record Separator) to null makes awk take everything
between successive blank lines as a record. Setting ORS (Output Record
Separator) to two newlines gives a blank line between output records.
The example could also be done as:

	awk '
	BEGIN { RS = ""; ORS = "\n\n" }
	/New York/
	' input-file >output-file

or the program can be hidden away in a file ( -f progfile ).

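For anyone who wants to try the -f form, here is a minimal sketch. The file names (input-file, progfile) and the sample paragraphs are made up for illustration, and it assumes an awk that supports RS = "" paragraph mode (nawk or any POSIX awk) plus a mktemp that accepts -d.

```shell
# Scratch directory so the sketch does not touch existing files.
tmpdir=$(mktemp -d)

# Invented sample input: three paragraphs separated by blank lines.
cat > "$tmpdir/input-file" <<'EOF'
The ferry leaves from New York at noon.
Bring a coat.

This paragraph mentions no city at all.

Trains to New York run hourly.
EOF

# The same awk program as above, stored in its own file for use with -f.
cat > "$tmpdir/progfile" <<'EOF'
BEGIN { RS = ""; ORS = "\n\n" }
/New York/
EOF

# Run the stored program; the shell strips the trailing blank line
# when capturing the output into a variable.
result=$(awk -f "$tmpdir/progfile" "$tmpdir/input-file")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With this input, the first and third paragraphs come out, with one blank line between them; the middle paragraph is dropped because it does not match /New York/.
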
Patterns can be _very_ complex, and you can have multiple patterns with
corresponding actions. In the example, the action is unspecified, and
defaults to outputting the current record.

If you don't have access to the book, see the man page. Note that in
SVR3/386 (and possibly other flavours) there is AWK(1) and NAWK(1)
(New AWK). (Old awk is being kept around for a while, presumably for
compatibility reasons.) The book corresponds to NAWK(1). At least in
SVR3/386, awk/nawk come with the basic system.

If you haven't ever used awk, give it a try. If you haven't read the
book, check it out -- it has all kinds of useful examples in a
surprisingly wide range of fields.
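
To illustrate the earlier point about multiple patterns with their own actions, here is a small sketch in the same paragraph mode. The input text, the patterns /alpha/ and /beta/, and the variable names are invented for the example; it is not taken from the book.

```shell
# Three invented paragraphs fed on standard input.
summary=$(printf 'alpha one\nsecond line\n\nbeta two\n\nalpha three\n' |
awk '
BEGIN   { RS = "" }                         # blank-line-separated records
/alpha/ { hits++ }                          # count paragraphs containing "alpha"
/beta/  { print "beta paragraph: " NR }     # report which paragraph has "beta"
END     { print "alpha paragraphs: " hits } # summary after all input is read
')
printf '%s\n' "$summary"
```

Each pattern is tested against every paragraph, and every action whose pattern matches is run, so one record can trigger several actions. NR is the number of the current record, which in this mode means the paragraph number.
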