Newsgroups: comp.mail.misc
Path: utzoo!utgpu!trigraph!john
From: john@trigraph.uucp (John Chew)
Subject: Re: Perl version of from (Was: Re: from.sed (v1.2))
Message-ID: <1989Dec29.170942.16243@trigraph.uucp>
Sender: "John J. Chew"
Reply-To: "John J. Chew"
Organization: Trigraph Inc., Toronto, Canada
References: <1989Dec20.222732.5633@trigraph.uucp>
Date: Fri, 29 Dec 89 17:09:42 GMT

In article <1989Dec20.222732.5633@trigraph.uucp> I posted a sed script
that does the job of from(1). Johan Vromans posted a perl script that
does the same thing. I've tried both out on various mailboxes and have
come to the following conclusions:

1. On small mailboxes, the compilation-time overhead of perl makes it
   a pig.

2. On large mailboxes, especially those containing long messages, perl
   can catch up to sed.

3. The following patch to Johan Vromans' perl script speeds it up by
   as much as 30% on large files, by tightening the search-for-From_
   loop: a cheap test for the literal prefix "From " now rejects most
   lines before the chop and the full pattern match are attempted.

*** old/from.jv.pl Fri Dec 29 12:05:15 1989
--- from.jv.pl Fri Dec 29 11:42:13 1989
***************
*** 30,40
  # read through input file(s)
! while ( $line = <> ) {
!   chop ($line);
!
!   # scan until "From_" header found
!   next unless $line =~ /^From\s+(\S+)\s+.*(\w{3}\s+\d+\s+\d+:\d+)/;
    $from = $1;
    $date = $2;
    if ( $date eq "" || $from eq "" ) {
--- 30,39 -----
  # read through input file(s)
! while (<>) {
!   next unless /^From /;
!   chop;
!   next unless /^From\s+(\S+)\s+.*(\w{3}\s+\d+\s+\d+:\d+)/;
    $from = $1;
    $date = $2;
    if ( $date eq "" || $from eq "" ) {
***************
*** 38,44
    $from = $1;
    $date = $2;
    if ( $date eq "" || $from eq "" ) {
!     print STDERR "Possible garbage: $line\n";
      next;
    }
--- 37,43 -----
    $from = $1;
    $date = $2;
    if ( $date eq "" || $from eq "" ) {
!     print STDERR "Possible garbage: $_\n";
      next;
    }

I'll keep both scripts around for now. I actually prefer the notion of
writing such things in perl, but when your mail machine is a
heavily-used VAX-11/750 you can't afford luxuries....

John
-- 
john j. chew, iii                phone: +1 416 425 3818  AppleLink: CDA0329
trigraph, inc., toronto, canada  {uunet!utai!utcsri,utgpu,utzoo}!trigraph!john
dept. of math., u. of toronto    poslfit@{utorgpu.bitnet,gpu.utcs.utoronto.ca}