Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!unmvax!pprg.unm.edu!hc!lll-winken!uunet!mcvax!ukc!icdoc!bilpin!jim
From: jim@bilpin.UUCP (Jim G)
Newsgroups: comp.lang.c
Subject: Re: C comment stripper shell script? -> use sed pipeline
Summary: AWK rules OK
Message-ID: <1583@bilpin.UUCP>
Date: 30 Mar 89 11:46:55 GMT
References: <1467@bilpin.UUCP> <2216@solo8.cs.vu.nl>
Organization: SRL, London, England
Lines: 47

#{ v_langC.2 }

IN ARTICLE <2216@solo8.cs.vu.nl>, maart@cs.vu.nl (Maarten Litmaath) WRITES:
> jim@bilpin.UUCP (Jim G) [**THAT'S ME, FOLKS!**] writes:
> #{ zapcom.sh }
> # Remove comments from a C program
> # sed removes comment strings which begin and end on the same line
> # awk removes comment strings which extend across multiple lines
> # sed/awk both handle nesting of comments within their context
[small but perfectly formed awk/sed script deleted]
>
> Aha!  You're using a SHELL script!  Well, in that case there's another word
> for my `sed approach' :-)
> No awk necessary.  This pipeline is reasonably fast too!
[immense sed script deleted]

Although I don't dispute the efficacy of the supplied script ( I haven't
checked it out, though ), I think that this m-iii-ght be taking a preference
for sed a m-iii-te too far.  My 3 line sed + 13 line awk script has been
replaced by a 101 line script with 66 lines of sed - hmmm.

Although awk is undoubtedly slower than sed, I use it in preference for
solving editing problems which can be defined on a field basis, as I find it
much easier to conceptualise solutions; I do not find the sed syntax or
operation conducive to an intuitive problem/solution association ( obviously
some peculiarity in how my brain, errrm, works ).
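
For anyone wondering what "on a field basis" amounts to here, the following
is a rough sketch only. It is written in Python rather than awk, and it is
not zapcom.sh (that script was deleted from the quote above). The function
name strip_comments_by_field and the nesting counter are assumptions made
for illustration. Each line is split into whitespace-separated fields, and
a field is kept only when it sits outside every comment.

```python
def strip_comments_by_field(source: str) -> str:
    """Drop C comments whose delimiters stand alone as whitespace-separated fields.

    Illustrative sketch only: delimiters glued to other text, or embedded in
    quoted strings, are NOT recognised properly, and runs of white space on
    each line are collapsed to single spaces.
    """
    depth = 0  # current comment nesting level, carried across lines
    out_lines = []
    for line in source.splitlines():
        kept = []
        started_in_comment = depth > 0
        saw_delimiter = False
        for field in line.split():
            if field == "/*":
                depth += 1
                saw_delimiter = True
            elif field == "*/" and depth > 0:
                depth -= 1
                saw_delimiter = True
            elif depth == 0:
                kept.append(field)
        # Drop lines that consisted of nothing but comment text;
        # keep genuinely blank lines from the original source.
        if kept or not (started_in_comment or saw_delimiter):
            out_lines.append(" ".join(kept))
    return "\n".join(out_lines)
```

Fed the line `int e; /*glued*/`, the sketch returns it unchanged, because
`/*glued*/` is a single field and never matches a bare delimiter. That is
the price of treating delimiters as fields.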

I aimed for conciseness and a simple, balanced structure in the code (rather
than maximum efficiency, or universal application), as this is easier for
people (including me) to understand, and therefore alter/improve, if they
wish; especially for novice users, who would probably feel safe in tinkering
with zapcom.sh, but would probably have to be restrained and sedated after
seeing Cstrip :-)

Also, zapcom.sh is not universally applicable, in that it requires comment
delimiters to be themselves delimited by white space/EOL (so awk can treat
them as individual fields); and it won't correctly handle comment delimiters
embedded in quotes.  There obviously comes a point where the effort required
to handle a special case outweighs the benefit achieved; I considered these
cases to come into that category.

We have now had a reasonable number of constructive postings on this subject
to give all interested parties a good set of approaches from which to choose.
Thank you and goodnight ...
--
Programmers' maxim : If it's not aesthetically pleasing, it's probably wrong.