Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!cs.utexas.edu!usc!henry.jpl.nasa.gov!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: speed: V2 verses V3
Message-ID: <6628@jpl-devvax.JPL.NASA.GOV>
Date: 18 Dec 89 22:20:10 GMT
References: <1808@uvaarpa.virginia.edu> <6609@jpl-devvax.JPL.NASA.GOV>
    <4047@convex.UUCP> <1989Dec18.032836.16434@psuvax1.cs.psu.edu>
    <4055@convex.UUCP> <1989Dec18.112735.4443@psuvax1.cs.psu.edu>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 56

In article <1989Dec18.112735.4443@psuvax1.cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes:
: Here's timings for a perl script that counts word frequencies.
: % time perl-2 wf.pl /etc/termcap >/dev/null
: 13.3u + 0.9s = 0:15 (95%); (0k+864k)/92k (0+0)io (0f+80r)pg+0sw
: % !!
: 13.4u + 0.7s = 0:14 (98%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
: % !!
: 13.3u + 0.8s = 0:14 (100%); (0k+872k)/92k (0+0)io (0f+79r)pg+0sw
: % time perl-3 wf.pl /etc/termcap >/dev/null
: 18.6u + 1.0s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw
: % !!
: 18.7u + 0.9s = 0:20 (95%); (0k+944k)/84k (0+0)io (0f+72r)pg+0sw
: % !!
: 18.7u + 0.9s = 0:20 (94%); (0k+944k)/84k (0+0)io (0f+73r)pg+0sw
:
:
: This is on a Sun-4.  /etc/termcap is 146k, about 32000 total words,
: about 2000 different words, average word length is 3 chars.
:
: If you want worse behavior, try /usr/dict/words.  About 24000 words,
: every one unique, average length 7 chars.  I get 103.0u for perl-2 and
: 158.2u for perl-3.
:
: Here's the script.
:
: #!/usr/bin/perl
: # Count word frequency.
: while (<>) {
:     foreach $k (split(/[^a-zA-Z]+/)) {
:         $k =~ tr/A-Z/a-z/, ++$freq{$k} if ($k);
:     }
: }
: foreach $k (sort downfreq keys(freq)) {
:     printf "%5d %s\n", $freq{$k}, $k;
: }
: sub downfreq {
:     ($freq{$b} - $freq{$a}) || ($a gt $b);
: }

This particular script is exercising almost none of the constructs that
were sped up in perl 3, and several of the constructs that were slowed down.

In particular, the sorting is probably a little slower for a couple of
reasons.  First, subroutine calls run a little slower due to the code to
handle array returns.  Second, associative array references are a bit
slower due to the check for dbm arrays, and making sure associative arrays
don't create themselves when checked by the "defined" function.

The foreach is also a bit slower due to allowing for nested references to
the same array.

Disclaimer: the above is merely well-informed speculation.  Profiling might
well pinpoint some other culprit.

Larry