Xref: utzoo comp.binaries.ibm.pc.d:9760 comp.sys.ibm.pc.misc:2370
Path: utzoo!attcan!uunet!snorkelwacker!apple!vsi1!wyse!bob
From: bob@wyse.wyse.com (Bob McGowen x4312 dept208)
Newsgroups: comp.binaries.ibm.pc.d,comp.sys.ibm.pc.misc
Subject: Re: Is AWK up to this application?
Keywords: awk
Message-ID: <2998@wyse.wyse.com>
Date: 5 Oct 90 23:27:05 GMT
References: <1661@wjvax.UUCP>
Sender: news@wyse.wyse.com
Reply-To: bob@wyse.UUCP (Bob McGowen x4312 dept208)
Followup-To: comp.binaries.ibm.pc.d
Organization: Wyse Technology
Lines: 50

In article <1661@wjvax.UUCP> mario@wjvax.UUCP (Mario Dona) writes:
>
>Does anyone know of a way of extracting information from a text file
>which contains variable length fields, and outputting it in a different
>format?  For example, I have a text file which contains names and
>addresses as shown below.
>
>John Doe           1563 Meadow Lane         San Jose, CA 94325 more stuff ---> ...
>^                  ^                        ^
>|                  |                        |_____ column 45
>|                  |______________________________ column 20
>|_________________________________________________ column 1
>
>I want to extract the names and addresses and output it in the following
>format:
>
>John Doe
>1563 Meadow Lane
>San Jose, CA 94325
.....
>Can this be done using AWK, and if so how?  Or is there some other way?
>

The answer re awk is that it depends.  If all the fields are separated
by spaces, in varying numbers, then you have problems using awk.  If
the file has (or can be recreated with) tabs (or some other separator
character like a colon or |) between each set of fields, then it is
relatively easy (i.e., if the white space marked below with the carets
were a tab in each instance).

John Doe           1563 Meadow Lane         San Jose, CA 94325 more stuff --->
        ^^^^^^^^^^^                ^^^^^^^^^                  ^

You could then use the following awk code to process your file:

    awk -F'->' '{printf "%s\n%s\n%s\n",$1,$2,$3}' file_to_process

Note that the -> is used to represent a literal tab.  If you use some
other character to separate the fields, then substitute it.

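For anyone who wants to try the tab-separated case without typing a
literal tab on the command line, here is a minimal sketch.  It assumes
a POSIX shell and awk; the function name split_fields and the sample
record are made up for illustration, and FS is set in a BEGIN block
rather than with -F, which does the same job.

```shell
# split_fields: print the first three tab-separated fields of each
# input line, one field per line.  (Hypothetical helper name.)
split_fields() {
    awk 'BEGIN { FS = "\t" }                      # split on tab only
         { printf "%s\n%s\n%s\n", $1, $2, $3 }' "$@"
}

# Sample record with a tab between each field (illustrative data only).
printf 'John Doe\t1563 Meadow Lane\tSan Jose, CA 94325\tmore stuff\n' |
    split_fields
# prints:
#   John Doe
#   1563 Meadow Lane
#   San Jose, CA 94325
```

Because only the tab is a separator here, the spaces inside "John Doe"
or "San Jose, CA 94325" stay within their fields.  For a colon or |
separated file, change the FS value accordingly.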
If the file has only spaces between all the visible printing
characters, your primary problem will be names that vary in the number
of words, since a space then cannot tell you where one field ends and
the next begins.  For instance, San Jose and New Orleans vs. Denver and
Oakland, or Meadow Lane vs. Meadow Lane Court.

E-mail me if you would like to discuss this in more detail.

Bob McGowan  (standard disclaimer, these are my own ...)
Product Support, Wyse Technology, San Jose, CA
..!uunet!wyse!bob    bob@wyse.com