Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!crdgw1!uunet!convex!newsadm
From: tchrist@convex.COM (Tom Christiansen)
Newsgroups: comp.unix.questions
Subject: Re: shell script to...
Keywords: sed, awk, script
Message-ID: <1991Apr11.171043.13657@convex.com>
Date: 11 Apr 91 17:10:43 GMT
References:
Sender: newsadm@convex.com (news access account)
Reply-To: tchrist@convex.COM (Tom Christiansen)
Organization: CONVEX Software Development, Richardson, TX
Lines: 87
Nntp-Posting-Host: pixel.convex.com

From the keyboard of neil@ms.uky.edu (Neil Greene):
:Any sed gurus that would like to explain how to accomplish the following.  I
:have not mastered the art of sed or awk.
:
:I have a file that contains drug names and next to the drug name is the drug
:group.
:
:> Dipyrone             Analgesic
:> Nefopam              Analgesic
:> Thiosalicylic Acid   Analgesic
:> Xylazine             Analgesic
:> Chloramphenicol      Antibiotic
:
:I need a shell script that will read from another (ascii) data file, find an
:occurrence of a DRUG_NAME, write the line to another (ascii) file and append
:the appropriate DRUG_TYPE to the new line.
:
:# line with drug name in it
:xxxx 01/02/90 xxxxxx xxxxx xxx x xxxxxxx Dipyrone .... xxxx xxxxx
:
:# rewrite new line to new ascii file
:xxxx 01/02/90 xxxxxx xxxxx xxx x xxxxxxx Dipyrone .... xxxx xxxxx Analgesic

Here's a simple-minded perl script to do this.  It reads from "drugs.types"
to load the table, then reads stdin and writes stdout according to your
spec:

    open (TYPES, "drugs.types") || die "can't open drugs.types: $!";
    while (<TYPES>) {
        split;                      # fields land in @_
        $types{$_[0]} = $_[1];      # drug name => drug type
    }

    while (<>) {
        chop;
        print;
        study;  # compile pattern space for speed
        foreach $name (keys %types) {
            if (/\b$name\b/) {
                print ' ', $types{$name};
                last;
            }
        }
        print "\n";
    }

No checking is done on the validity of the input in the TYPES file.  This
would also be a bit slow if you had a big table, because of all the
re_comp()s that get called: each pattern is recompiled for every input
line.  A faster, albeit less obvious way to do this would be to use an eval.
This makes the patterns look like a bunch of constant strings, which, when
combined with the "study" statement, does B-M (Boyer-Moore) one better, and
really blazes.  Another possible speed optimization would be to make the
if's into a cascading if/elsif block, which would get internalized into one
big switch statement, and perl would jump directly to the right case.

    open (TYPES, "drugs.types") || die "can't open drugs.types: $!";
    while (<TYPES>) {
        split;
        $types{$_[0]} = $_[1];
    }

    $code = <<EO_CODE;
    while (<>) {
        chop;
        print;
EO_CODE

    for $name (keys %types) {
        $code .= <