Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!usc!elroy.jpl.nasa.gov!sdd.hp.com!spool.mu.edu!uunet!convex!usenet
From: tchrist@convex.COM (Tom Christiansen)
Newsgroups: comp.lang.perl
Subject: Re: pattern matching performance
Keywords: grep, pattern matching
Message-ID: <1991May06.193907.3323@convex.com>
Date: 6 May 91 19:39:07 GMT
References: <1991May6.161906.21161@ccu.umanitoba.ca>
Sender: usenet@convex.com (news access account)
Reply-To: tchrist@convex.COM (Tom Christiansen)
Organization: CONVEX Software Development, Richardson, TX
Lines: 69
Nntp-Posting-Host: pixel

From the keyboard of rahardj@ccu.umanitoba.ca (Budi Rahardjo):
:I was wondering if anybody could show me the fastest way to match
:a pattern in perl. I have a big flatfile (around 17000 lines).
:Using UNIX grep takes around 1 or 2 second, but perl's pattern
:matching takes 12 secs.
:Need advice ...
:
:-- budi
:
:Here is a simplified benchmark that I use
:
:----- cut here ----
:#!/usr/local/bin/perl
:# Benchmarking pattern matching
:# Sun4 - Sun OS 4.1.1
:#
:print "Enter a pattern to grep : ";
:$pat = <STDIN>; chop $pat;
:
:# Using UNIX grep
:$start = time;
:open (FLAT,"grep $pat flatfile|");
:while (<FLAT>) { print; }
:$elapse = time - $start;
:print ">>>> UNIX grep takes $elapse sec.\n";
:close(FLAT);
:
:# Using perl pattern matching
:open(FLAT,"flatfile");
:$start = time;
:while (<FLAT>) {
:    if (/$pat/) { print;}
:    }
:$elapse = time - $start;
:print ">>>> Perl's pattern matching takes $elapse sec.\n";
:close(FLAT);

First, don't use a block on your if.  It just slows you down.
Perl has to go through a bit of overhead because of your block
declaration -- I'm sure Larry can explain much better than I the
magical contortions he goes through to optimize some of this stuff.
The short story is that you want to do one of these:

    print if /$pat/;
    /$pat/ && print;

Second, your pattern is invariant, so tell perl this by adding a /o
modifier to your pattern match.

    print if /$pat/o;

Finally, you could change the loop part to an eval so that you trick
the perl compiler into seeing a fixed expression and thus calling a
better checker.  Why Larry doesn't do this on a /o as well, I do not
know, but he doesn't, and this will run faster than the preceding
example:

    open(FLAT,"flatfile");
    $start = time;
    eval "while (<FLAT>) { print if /$pat/; }";
    $elapse = time - $start;
    print ">>>> Perl's pattern matching takes $elapse sec.\n";
    close(FLAT);

--tom
--
Tom Christiansen		tchrist@convex.com	convex!tchrist
	"So much mail, so little time."
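
For anyone who wants to try the /o form and the eval form side by side
without the flatfile, here is a minimal sketch.  It is not the original
benchmark: the in-memory @lines array, the sample pattern, and the
names @with_o and @with_eval are all made up for illustration, and it
assumes the pattern is trusted and contains no "/" (an eval'd string
built from arbitrary user input would be unsafe).  It only checks that
both forms select the same lines; it makes no claim about timing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the flatfile: a few lines held in memory.
my @lines = ("alpha one\n", "beta two\n", "alpha three\n", "gamma four\n");

# Assumption: the pattern is trusted and contains no "/" character.
my $pat = 'alpha';

# Variant 1: statement modifier plus /o, so $pat is interpolated once.
my @with_o;
for (@lines) { push @with_o, $_ if /$pat/o; }

# Variant 2: build the loop as a string so the compiler sees a
# constant pattern, then eval it.  @lines and @with_eval are visible
# inside the eval because a string eval runs in the enclosing
# lexical scope.
my @with_eval;
my $code = 'for (@lines) { push @with_eval, $_ if /' . $pat . '/; }';
eval $code;
die "eval failed: $@" if $@;

print "with /o:   ", scalar(@with_o), " matches\n";
print "with eval: ", scalar(@with_eval), " matches\n";
```

The single-quoted string keeps $_ and the array names from being
interpolated early; only $pat is spliced in, which is the whole point
of the trick.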