Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!husc6!cmcl2!beta!dph
From: dph@beta.UUCP (David P Huelsbeck)
Newsgroups: comp.unix.questions
Subject: Re: An awk question or two...
Message-ID: <10096@beta.UUCP>
Date: Tue, 15-Sep-87 18:39:49 EDT
Article-I.D.: beta.10096
Posted: Tue Sep 15 18:39:49 1987
Date-Received: Thu, 17-Sep-87 06:23:23 EDT
References: <3931@well.UUCP> <27817@sun.uucp> <90@aimt.UUCP>
Reply-To: dph@LANL.GOV.ARPA (David P Huelsbeck)
Distribution: na
Organization: Los Alamos Natl Lab, Los Alamos, N.M.
Lines: 85
Keywords: awk ranges
Summary: You mean you believe the documentation!

In article <90@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>In article <27817@sun.uucp>, guy@sun.uucp (Guy Harris) writes:
>> > awk -f comline.awk comvar=\"SUB\" ascii.h
>> >
>> > this is the file comline.awk:
>> >
>> > comvar { print }
>> >
>>
>> "comvar" is not a legal pattern. A pattern is either a keyword
>> (such as BEGIN or END), a relational expression, or a regular expression.
[....]
>Actually, according to: "Awk - A Pattern Scanning and Processing Language"
>by Aho Kernighan and Weinberger (Second Edition), "A variety of expressions
>may be used as patterns: regular expressions, arithmetic relational
>expressions,
>*string-valued expressions*, and arbitrary boolean combinations of these."
[...]
> Or
>is my interpretation of documentation flawed? God knows I've been bitten by
>my too-liberal interpretation of documentation before.
>
>--
>Breck Beatie
>uunet!aimt!breck

No. (at least I don't think so)
The documentation does say that. However, if you read the abstract page
you'll find:

"*Awk* patterns may include arbitrary boolean combinations of ..."
                                      ^^^^^^^^^^^^^^^^^^^^

If you think about it a while this makes sense. When each record is read
awk will run down the list of pattern-action pairs, look at each pattern
and then either do the action or not do the action. It does it or it
doesn't, so it's boolean.
Or at least it needs to make a boolean type of decision.

What is misleading is the fact that awk allows a pattern to be a simple
regex in slashes, or to be nonexistent, in addition to the clearly
boolean-valued patterns like "a == b". But if you think of the
nonexistent or default pattern as a shorthand for TRUE or 1 == 1 or
whatever, and the /regex/ pattern as "$0 ~ /regex/", then it is clear why
a variable or string-valued expression will not work. It's really not a
"syntax error" as awk claims but rather a semantic error of the "type
mismatch" variety.

What is needed to make awk behave the way we've been talking about is a
new built-in function like:

	match(str,expr)

which is boolean valued, where "str" may be any string and "expr" may be
any string which is itself a valid regex. (NOTE: I've never found occasion
to use a function in a pattern, but from the lex source it looks like it
ought to work.)

The problem is you can't define new functions in standard awk. They're
built in at the lowest level just like + and - and all the rest.

I have been told there is a new awk out with subroutines/function calls
and the like. I haven't seen it. I'd like to, but I'd be more inclined to
just rewrite awk the way I'd like it than to pay for a new version.
There is a program called "bawk" which is similar to awk and whose source
is available for free. I've looked at it but never used it, so I can't
comment further.

So the way to make awk work the way you'd like is to rewrite it in some
fashion. Or just use the shell as Guy suggested. The true power of UNIX
is using your tools in harmony. The
stream-of-bytes-in-stream-of-bytes-out paradigm is what makes UNIX UNIX.
Use it.

	David Huelsbeck
	dph@lanl.gov
	{cmcl2,ihnp4}!lanl!dph

Sorry for going on so.
Say, when will we be seeing comp.awk.questions?

#include
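
The two shorthands described above (no pattern read as 1 == 1, and
/regex/ read as "$0 ~ /regex/") can be checked from the shell. This is
only a sketch. It assumes a POSIX-style awk such as the ones on current
systems, which may not behave like the 1987 awk under discussion, and the
three-line sample file is a made-up stand-in for ascii.h, whose contents
are not shown in the thread. The last command, where the variable comvar
holds the regex, is likewise an assumption about newer awks, not something
the awk discussed here is claimed to accept.

```shell
#!/bin/sh
# Hypothetical stand-in for ascii.h (the real file is not shown in the thread).
tmp=$(mktemp)
printf '%s\n' '#define NUL 0' '#define SUB 26' '#define ESC 27' > "$tmp"

# A bare /regex/ pattern and an explicit match against $0
# should select the same records.
short=$(awk '/SUB/ { print }' "$tmp")
long=$(awk '$0 ~ /SUB/ { print }' "$tmp")

# A missing pattern and an always-true pattern should both
# select every record.
all_default=$(awk '{ print }' "$tmp")
all_true=$(awk '1 == 1 { print }' "$tmp")

# Assumption: a POSIX-style awk, where the right-hand side of ~
# may be a variable set by a command-line assignment.
dynamic=$(awk '$0 ~ comvar { print }' comvar=SUB "$tmp")

rm -f "$tmp"

# Show what each of the three SUB-matching forms selected.
printf '%s\n' "$short" "$long" "$dynamic"
```

If the assumptions hold, all three printed lines should be identical,
which is the "boolean decision per record" view of patterns in action.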