Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!apple!voder!pyramid!prls!mips!wyse!vsi1!altnet!uunet!mcvax!hp4nl!philmds!leo
From: leo@philmds.UUCP (Leo de Wit)
Newsgroups: comp.unix.questions
Subject: Re: sed script to combine blank lines?
Keywords: sed
Message-ID: <836@philmds.UUCP>
Date: 16 Oct 88 19:23:59 GMT
References: <192@vlsi.ll.mit.edu> <136@nascom.UUCP>
Reply-To: leo@philmds.UUCP (Leo de Wit)
Distribution: comp
Organization: Philips I&E DTS Eindhoven
Lines: 46

In article <136@nascom.UUCP> rar@nascom.UUCP (Alan Ramacher) writes:
|In article <192@vlsi.ll.mit.edu>, young@vlsi.ll.mit.edu (George Young) writes:
|> Is there a 'sed' wizard out there?  I often want to take a big ascii file
|> (like a .c file after cc -E) and collapse each group of 'blank' lines
|> into exactly one blank line.  'Blank' here is any combination of blanks,
|> tabs and maybe ^L's.  It looks from the documentation that sed should do this
|> quite neatly, using the multiple line pattern space commands with imbedded
|> newlines, but I sure can't figure out how.  I'd prefer the resulting blank
|> line to be just a newline.
|
|sed is not powerful enough for the job, but a simple awk script will
|work. If you have difficulties writing it, let me know and I will
|supply one. Good luck.

I already mailed George a solution, but couldn't leave this one alone...
Sed is most certainly powerful enough, as I'll show in a minute; in
fact, I think sed is the preferable tool for such a typical text
processing job, and its speed is not the least of the reasons.

Here's my sed solution. Note that tab and formfeed are written here as
^I and ^L so your pager isn't fooled; in the real script you should of
course type the actual control characters.

(using /bin/sh as the command interpreter:)

sed -n -e '
/^[ ^I^L]*$/{
    s/^.*$//p
    : again
    n
    s/^[ ^I^L]*$//
    t again
}
p' your_file

Explanation: because of -n, nothing is printed unless a 'p' asks for it.
Whenever a line containing only blank characters is read (i.e. one
matching the first pattern), print just one newline. Discard any blank
lines that follow (the 'again' loop; 'n' fetches the next line and
simply quits when the input runs out). On leaving the 'first pattern
subroutine 8-)', print the non-blank line that's now in the pattern
space. Simple enough, huh?
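
For anyone who would rather not type control characters into a script,
here is a minimal sketch of the same commands with the character class
built by printf. It assumes a printf that understands \t and \f; the
names 'blank' and 'squeeze_blank' are made up for this illustration.

```shell
#!/bin/sh
# Build the bracket expression from real control characters.
# Assumption: printf translates \t to a tab and \f to a formfeed.
blank=$(printf '[ \t\f]')

# squeeze_blank is a hypothetical helper name. It runs the same sed
# commands as the script above on the named files (or on stdin).
squeeze_blank() {
    sed -n -e "
/^${blank}*\$/{
    s/^.*\$//p
    : again
    n
    s/^${blank}*\$//
    t again
}
p" "$@"
}

# Sample input: 'one' and 'two' separated by three 'blank' lines
# (an empty line, a space plus tab, and a lone formfeed).
printf 'one\n\n \t\n\f\ntwo\n' | squeeze_blank
```

The sample should come out as 'one', a single empty line, and 'two'.
The script is in double quotes here so that ${blank} is expanded, which
is why each '$' meant for sed is written as '\$'.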

                                            Leo.

P.S. I don't doubt it can be done with awk (it could even be programmed
in the shell). I doubt, however, that it will be nearly as fast as the
sed solution.
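
For comparison, an awk version might look roughly like the sketch
below. This is not the script offered earlier in the thread, just one
possible way to do it; it assumes an awk that accepts \t and \f inside
a regular expression, and 'squeeze_blank_awk' and the variable 'blank'
are made-up names.

```shell
#!/bin/sh
# squeeze_blank_awk is a hypothetical helper name.
# Assumption: this awk understands \t and \f in a regular expression.
squeeze_blank_awk() {
    awk '
# A blank line: print one empty line for the first of a run, skip the rest.
/^[ \t\f]*$/ { if (!blank) print ""; blank = 1; next }
# Any other line ends the run and is printed unchanged.
{ blank = 0; print }
' "$@"
}

# Same sample as before: three blank lines between two text lines.
printf 'one\n\n \t\n\f\ntwo\n' | squeeze_blank_awk
```

The flag 'blank' remembers whether the previous line was blank, which
is the job the 'again' loop does in the sed script.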