Path: utzoo!utgpu!watmath!iuvax!cica!ctrsol!emory!arnold
From: arnold@mathcs.emory.edu (Arnold D. Robbins {EUCC})
Newsgroups: comp.unix.questions
Subject: Re: Import variables in to awk.
Message-ID: <4609@emory.mathcs.emory.edu>
Date: 17 Nov 89 00:00:53 GMT
References: <10531@thorin.cs.unc.edu> <15919@bloom-beacon.MIT.EDU> <36495@ism780c.isc.com>
Reply-To: arnold@emory.UUCP (Arnold D. Robbins {EUCC})
Distribution: na
Organization: Math and Computer Science, Emory University, Atlanta GA
Lines: 83

OK. Hopefully this is the definitive word on how things work.

V7 awk (old awk, /usr/bin/awk on Suns and other 4.3 based machines)

        awk '....' a=1 b=2 file c=3 file

a is set to 1, b to 2, then the files are read and no more assignments
are done. This feature was undocumented. On my Sun, the values of a and
b are NOT available in the BEGIN block. After the first file is read c
gets set to 3. Then the next one is read.

S5R3.n, n >= 1 nawk (new awk)

        awk '....' a=1 b=2 file c=3 file

a is set to 1, b to 2, and those values ARE available in the BEGIN
block. Then the first file is read, then c is set to 3, then the second
file is read. The value of c is NOT set in the BEGIN block.

There are inconsistencies here, since conceptually the assignments are
done when awk goes to open a file and "notices" that the argument is
really a variable assignment. But a and b are assigned before any
program execution begins, while files aren't opened until after the
BEGIN block has been run. Note that the assignment of c is done
correctly, after the BEGIN block.

GNU Awk 2.11 and S5R4 nawk

        awk -v z=26 '....' a=1 b=2 file c=3 file

z is set to 26 before the BEGIN block is executed. Then the BEGIN block
is run. a is set to 1, b to 2, the first file is opened and processed,
then c is set to 3, and then the second file is processed.

Unfortunately, people had come to rely on the way nawk did assignments
before the BEGIN block was run. Yet that behavior was inconsistent.
So, to have our cake and eat it too, ALL assignments that appear where
file names are supposed to be are done after the BEGIN block. But, to
make a variable available in the BEGIN block, the new -v option was
added. You must supply a separate -v option for each variable to be
assigned.

It is important to note that normal assignments are done AT THE TIME
they would have been opened as a file; don't expect c to be set while
the first file is being processed.

This is something that took some discussion and hammering out between
the GNU people (me and David Trueman), Brian Kernighan at Bell Labs (and
Al Aho through him), and Randall Howard at MKS. In fact, when Brian
first changed his awk to be consistent he got the loudest complaints
about needing variable assignments to happen before the BEGIN block was
run (Hi Tom!). Adding a command line option was the best compromise we
could come up with -- the text of the awk program does not change, just
the command line to invoke it, and everyone felt that while it wasn't
particularly pretty, we could all live with it.

(I mentioned the S5R4 awk above; I can't promise this, but I do know
that Brian has made his version of awk, which works as described above,
available to them for inclusion in S5R4. Perhaps someone doing S5R4 at
AT&T can let us know if it made it in. He also should have gotten his
version to the toolchest, but I don't know about that for sure either.)

GNU Awk 2.11.1 (version 2.11 at patchlevel 1) has been sent to
comp.sources.unix and should be appearing there shortly. Some version
of GNU Awk will be in 4.4 BSD, when that comes out.

***

There is the separate question, "what if I have a filename with an `='
in it?" The short answer is "don't do that". It should perhaps be
possible to come up with a simple and consistent rule. I don't know
what that rule is right now though, since we haven't given it a lot of
thought yet. But I suspect you can look for a change in gawk 2.12 to
address this.

Any more questions, class?
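The ordering described above for GNU Awk 2.11 can be checked with a
short script. This is only a minimal sketch, assuming an awk that has
the -v option and does all other command-line assignments after the
BEGIN block; the function name demo, the scratch files, and the values
are made up for illustration and are not from any real setup.

```shell
#!/bin/sh
# Sketch only: demo, the scratch files, and the values are illustrative.
# Assumes an awk with -v that performs operand assignments after BEGIN.
demo() {
    dir=$(mktemp -d) || return 1
    printf 'one\n' > "$dir/first"
    printf 'two\n' > "$dir/second"

    # z is passed with -v, so it should be visible in BEGIN.
    # a=1 and c=3 sit where file names go, so each should take effect
    # only when awk reaches that argument in the operand list.
    awk -v z=26 '
        BEGIN { printf "BEGIN: z=%s a=[%s] c=[%s]\n", z, a, c }
              { printf "%s: a=[%s] c=[%s]\n", $0, a, c }
    ' a=1 "$dir/first" c=3 "$dir/second"

    rm -rf "$dir"
}

out=$(demo)
printf '%s\n' "$out"
```

On an awk that behaves as described, the BEGIN line should show z=26
with a and c still empty, the record from the first file should show
a=1 with c empty, and the record from the second file should show a=1
and c=3.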
:-)
-- 
Arnold Robbins -- guest account at Emory Math/CS    | Laundry increases
DOMAIN: arnold@emory.mathcs.emory.edu               | exponentially in the
UUCP: gatech!emory!arnold  PHONE: +1 404 636-7221   | number of children.
BITNET: arnold@emory                                |   -- Miriam Hartholz