Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!swrinde!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.jpl.nasa.gov (Larry Wall)
Newsgroups: comp.editors
Subject: Re: perl script request
Message-ID: <1991May11.012824.5139@jpl-devvax.jpl.nasa.gov>
Date: 11 May 91 01:28:24 GMT
References: <2083@qusunc.queensu.CA>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 65

In article <2083@qusunc.queensu.CA> prastowo@qucis.queensu.CA (Bambang Nurcahyo Prastowo) writes:
: Could any perl guru help me rewrite the following sh script in perl ?
: The script is a mail sorter that cleans incoming mails (simplifies
: headers and removes duplicated blank lines) and save them in mailboxes
: named after the first words of email addresses appear in "From:...".

Sure.  It'd go something like this:

------------------------------------------------------
#!/usr/bin/perl

$MAILDIR = "/grad/prastowo/MAIL";

$/ = '';                # paragraph mode
$* = 1;                 # enable multiline matching for ^

# parse header into associative array
%hdr = ('FROM', split(/^([-\w]+):[ \t]*/, <>));

# open appropriate file
$sd = $hdr{'From'};
$sd =~ s/^.*<\s*(.*)>.*$/$1/;   # handle <> form
$sd =~ s/[\s@%.!].*//;          # delete trailing stuff
$sd =~ tr/a-z/A-Z/;             # canonicalize to upper case

open(STDOUT, ">>$MAILDIR/$sd") || die "Can't open $sd: $!\n";

# write the header (pick your order)
print $hdr{'FROM'}                  if $hdr{'FROM'};
print 'From: ', $hdr{'From'}        if $hdr{'From'};
print 'Date: ', $hdr{'Date'}        if $hdr{'Date'};
print 'Subject: ', $hdr{'Subject'}  if $hdr{'Subject'};
print "\n";

# now do each remaining paragraph
while (<>) {
    $* = 0;
    s/^\n+//;           # delete extra blank lines
    $* = 1;
    s/^(Date|From|Subject):/>$1:/g;
    print;
}
print "_____________________________________\n\n";
------------------------------------------------------

I translated it fairly literally, including the fact that addresses like
foo!bar@fiddle save to file FOO rather than BAR.
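
To see how the mailbox name falls out of an address, the three
substitutions can be pulled into a small function and tried on sample
input.  This is only a sketch: mailbox_name is a hypothetical helper that
does not appear in the script, the second sample string is constructed
for the demonstration, and the input is assumed to have no trailing
newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): apply the same
# three steps the script uses to turn a From: value into a mailbox name.
sub mailbox_name {
    my ($sd) = @_;
    $sd =~ s/^.*<\s*(.*)>.*$/$1/;   # keep only what is inside <...>, if present
    $sd =~ s/[\s@%.!].*//;          # cut at the first whitespace, @, %, . or !
    $sd =~ tr/a-z/A-Z/;             # canonicalize to upper case
    return $sd;
}

print mailbox_name('foo!bar@fiddle'), "\n";                  # prints FOO
print mailbox_name('Larry Wall <lwall@netlabs.com>'), "\n";  # prints LWALL
```

The bang-path case comes out as FOO because the second substitution cuts
at the first delimiter it meets, which is the "!" rather than the "@".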

I chose to process the message in paragraph mode because it makes it easy
to find extra newlines (they show up at the front of the paragraph).  The
only tricky thing is that you have to turn on and off whether ^ matches the
beginning of each line in the string or just the beginning of the string.

This could fairly easily be restructured to handle multiple messages on
the input stream by putting the header code into the final loop and
executing it whenever the current paragraph looks like a header.

Larry Wall
lwall@netlabs.com