Path: utzoo!mnetor!uunet!husc6!rutgers!umd5!purdue!i.cc.purdue.edu!j.cc.purdue.edu!pur-ee!iuvax!inuxc!ihnp4!ttrdc!levy
From: levy@ttrdc.UUCP (Daniel R. Levy)
Newsgroups: comp.bugs.sys5
Subject: Re: awk bug
Message-ID: <2161@ttrdc.UUCP>
Date: 7 Feb 88 21:19:25 GMT
References: <672@pttesac.UUCP> <3748@megaron.arizona.edu>
Organization: AT&T, Skokie, IL
Lines: 52

In article <3748@megaron.arizona.edu>, rupley@arizona.edu (John Rupley) writes:
> In article <672@pttesac.UUCP>, vanam@pttesac.UUCP (Marnix van Ammers) writes:
> > While trying to install a new program I ran across a bug in our Sys
> > V, release 2.1.1 (AT&T 3B20) awk.  In our awk the following pattern
> > always matches (even if there are 5 or less fields on the current
> > line):
> >     if $6 != ""
> > This does not happen on the awk on my 3B1 version 3.51 .
> > Is this a known bug or what?
> Could it be a corrupt copy of awk on your release 2 system?
> The following code executes properly with my SysV.r2 awk and
> with the new awk (your 3.51 version?):
>     echo $* | awk '$6 != "" {print "$6_!=_zerolength", NR, NF, $6}'
>     echo $* | awk '{if ($6 != "")print "$6_!=_zerolength", NR, NF, $6}'

Alas, I must plead guilty (even though I'm not responsible for awk, I'm
still a Death-Starian) to awk behaving this way on the 3B20 (we're
running 2.0v3 here).  The behavior comes from a dereference of a null
pointer (the string "f{\0" is present beginning at location zero in a
3B20 process).  If Rupley is using a VAX, on the other hand, everything
will seem to be hunky-dory (location 0 in a VAX [System V UNIX] process
contains a zero byte, which is tantamount to a null string).

I would posit that, just as when programming in C, testing a field
without first knowing that it is valid (that is, that the field count
is high enough) is poor programming practice.
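
As a minimal sketch of checking the field count first (assuming a POSIX
shell and any awk; the two-record input and the variable names sample
and guarded are made up for illustration, not taken from the original
report), a pattern can test NF so that $6 is never examined on a short
record:

```shell
#!/bin/sh
# Hypothetical sample input: one short record (3 fields) and one
# full record (6 fields).
sample='a b c
a b c d e f'

# Guard on the field count NF.  The action, and with it any use of $6,
# only runs for records that really have at least six fields.
guarded=$(printf '%s\n' "$sample" | awk 'NF >= 6 { print NR, NF, $6 }')

printf '%s\n' "$guarded"
```

With this input only the second record should be reported, as "2 6 f";
the three-field record is skipped without $6 ever being looked at.
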
I will eat these words if someone can show me awk documentation that
says that an undefined positional parameter is guaranteed to be null/0,
just as an undefined member of an array or a previously unused variable
is guaranteed to be.  (I've written many a line of awk code using much
the same care I would use with C, and never tripped over this problem.)
Barring such a guarantee, and certainly in the present situation, it is
better practice, given that one knows that there may be fewer than six
positional parameters in an input record, to use

    NF >= 6 { action using $6 }

than it is to use

    $6 != "" { action using $6 }

just as you would not blithely want to do (in C):

    main(argc,argv)
    char **argv;
    {
        foo(argv[6]);    /* what if argc < 6 ? */
    }
--
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |        }!ttrdc!ttrda!levy
| AT&T Computer Systems Division |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|