Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!rpi!batcomputer!lacey
From: lacey@batcomputer.tn.cornell.edu (John Lacey)
Newsgroups: comp.unix.questions
Subject: Finding words in paragraphs (was: Help a novice: Will "sed" do?)
Message-ID: <8421@batcomputer.tn.cornell.edu>
Date: 17 Jul 89 16:41:30 GMT
Reply-To: lacey@tcgould.tn.cornell.edu (John Lacey)
Organization: Cornell Theory Center, Cornell University, Ithaca NY
Lines: 46

As regards the question of finding paragraphs in text which contain a
particular word, I sent the following reply directly to the asker of the
question.  But then I saw the reply that no Unix utility could handle
this, and I have to disagree.  Awk will handle this case with no problem.
Certainly the Awk solution is much nicer than the previous proposal.

----------------

Awk is what you want in this case.  Try something like this:

    awk 'BEGIN { RS = ""; FS = "\n" } /the-word-here/' the-filename-here

An Awk program is a series of pattern-action pairs.  Whenever a record
matches the pattern, the associated action is taken.  BEGIN is a special
pattern that matches exactly once, before any input is read.  END is the
related pattern, matched after the input has reached EOF.

FS is the field separator, RS is the record separator.  Setting RS to the
empty string puts Awk in paragraph mode: records are separated by one or
more blank lines, so each paragraph becomes a different record.  Setting
FS to a newline makes each line of the paragraph a field.  Then, we search
for the word in question.  Patterns in Awk are egrep-type regular
expressions, bounded by /'s.

I left off the action, to save space.  Any missing action is taken to be
a print-the-record.  You can do this explicitly with a print command.

Awk is a lovely language.  I write a lot of one liners like this, and I
also use it to write reasonably large applications (including a small
relational database).

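For what it's worth, here is the same paragraph search with the action
written out, as a minimal sketch rather than the one true way.  It assumes
a POSIX-style awk (nawk, gawk) in which an empty RS means paragraph mode
and the -v option is available.  The function name find_paragraphs is one
I made up for illustration.

```shell
# find_paragraphs is a hypothetical helper name, used only for illustration.
# Usage: find_paragraphs WORD FILE
find_paragraphs() {
    awk -v word="$1" '
        # Paragraph mode: blank lines separate records, each line is a field.
        # ORS puts a blank line after every paragraph that gets printed.
        BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" }

        # Explicit action: print the whole record when it matches the word.
        $0 ~ word { print }
    ' "$2"
}
```

Called as "find_paragraphs needle notes.txt" (file name assumed), this
should print each paragraph of notes.txt that contains needle, with a
blank line after each one.  The word is used as a regular expression, so
any metacharacters in it need escaping.
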
If you don't have awk documentation around, there is a book by Aho,
Kernighan, and Weinberger (A, W, K) called, appropriately, The AWK
Programming Language, that explains the whole thing.

Good luck, and cheers,
-- 
John Lacey           |  Internet:  lacey@tcgould.tn.cornell.edu
running unattached   |  BITnet:    lacey@crnlthry
                     |  UUCP:      cornell!batcomputer!lacey
"Whereof one cannot speak, thereof one must remain silent."  ---Wittgenstein