Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!bloom-beacon!apple!rutgers!njin!princeton!phoenix!bernsten
From: bernsten@phoenix.Princeton.EDU (Dan Bernstein)
Newsgroups: comp.lang.c
Subject: Re: Scrunch blank lines
Summary: sed
Message-ID: <7472@phoenix.Princeton.EDU>
Date: 29 Mar 89 21:56:36 GMT
References: <7150@siemens.UUCP> <9900010@bradley> <4896@cbnews.ATT.COM> <26389@cornell.UUCP> <6839@cg-atla.UUCP> <623@gonzo.UUCP>
Reply-To: bernsten@phoenix.Princeton.EDU (Dan Bernstein)
Distribution: na
Organization: Princeton U. Undergrad Math Majors, last time I checked
Lines: 35

Dave Brower asks for a filter ``that will take "blank line" style cpp
output on stdin and send to stdout a scrunched version with appropriate
#line directives.''

If we may combine built-in utilities to handle the problem, then this
9-line shell script will do it (combine the last two lines to make it 8):

#!/bin/sh
( tr XY '\375\376' | sed 's/^\(.\)\(.*\)/X\1\2Y/
tend
i\
X#line
d
:end
=' | uniq | tr '\012X' ' \012'; echo ''; ) |
sed 's/Y.*//' | tr '\375\376' XY | sed -n '1!p'

The idea is reasonably simple; one could use, e.g., grep -n '.' to obtain
a similar solution. This particular version destroys any \375 and \376
you may have in your source, and because it's based on sed, it omits the
final line if it has no newline. It has been tested successfully on a
wide variety of sources, and I must say the next time I feel compelled
to look at cpp output, I'll definitely use it.

> I have two entries so far, one in "lex" and another in "awk". Both are
> less than 20 lines. It will be interesting to compare timings between
> awk, gawk, nawk, lex and flex.

Ahem? Are we forgetting sed here? (Then again, I hate awk, love sed,
and prefer C to lex. I'd rather have a sed script twice as slow as an
awk script. But that's just personal bias.)

If you time, make sure to test out on really long sources too. I'd hate
to see my script penalized just because it totals eight+sh execs :-).

---Dan Bernstein, bernsten@phoenix.princeton.edu
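
The post mentions that grep -n '.' could serve as the basis for a similar
solution but does not show one. What follows is only a rough sketch of how
that variant might look, not the poster's script. The function name
scrunch and the variables prev, entry, n and text are made up for
illustration. It assumes a POSIX sh and a grep that prefixes each
matching line with its line number and a colon.

```shell
#!/bin/sh
# scrunch (hypothetical name): read "blank line" style text on stdin,
# drop the empty lines, and print "#line N" wherever the numbering of
# the surviving lines jumps.
scrunch() {
    prev=0
    # grep -n '.' keeps only lines with at least one character and
    # prefixes each with its original line number, e.g. "4:int x;".
    grep -n '.' | while IFS= read -r entry; do
        n=${entry%%:*}       # line number before the first colon
        text=${entry#*:}     # original line after the first colon
        # A gap in the numbering means blank lines were skipped.
        if [ "$n" -ne $((prev + 1)) ]; then
            printf '#line %s\n' "$n"
        fi
        printf '%s\n' "$text"
        prev=$n
    done
}
```

With the function defined, something like  cc -E foo.c | scrunch  would
be one way to use it (foo.c is a placeholder). Unlike the sed pipeline,
this sketch leaves \375 and \376 bytes alone, but it runs a shell loop
per line, so it is likely to lose badly on really long sources.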