Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!sdd.hp.com!decwrl!bacchus.pa.dec.com!news
From: vixie@volition.pa.dec.com (Paul Vixie)
Newsgroups: comp.lang.perl
Subject: fast grep?
Message-ID: <1990Oct15.211219.8543@wrl.dec.com>
Date: 15 Oct 90 21:12:19 GMT
Sender: news@wrl.dec.com (News)
Organization: DEC Western Research Lab
Lines: 26

I need to grep for a simple pattern in a large file from within a perl
script.  Simple means no metacharacters, large means many megabytes.  So
far it looks like "netgrep" (the Boyer-Moore thing that later turned into
GNU grep) is the clear winner.

	net/bm/gnu grep...
		0.3u 1.5s 0:09 19% 20+20k 284+0io 0pf+0w
		0.3u 1.3s 0:11 15% 20+20k 285+1io 0pf+0w
		0.3u 0.8s 0:01 92% 20+20k 0+0io 0pf+0w

	perl  "while (<>) { print if (/$pat/o); }"...
		5.3u 1.3s 0:12 53% 151+63k 208+0io 3pf+0w
		5.3u 1.3s 0:09 68% 152+63k 63+0io 3pf+0w
		5.3u 2.1s 0:16 45% 152+63k 280+0io 3pf+0w

	perl  "print grep(/$pat/o, <>);"...
		7.6u 5.4s 0:18 69% 136+3894k 284+1io 3pf+0w
		7.8u 5.1s 0:19 67% 149+3878k 282+1io 3pf+0w
		7.7u 4.4s 0:17 67% 137+3920k 282+1io 3pf+0w

Have I found something that perl can't do as well as C?  With this
kind of gap in CPU time, it's cheaper to fork/exec a bmgrep.
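
For concreteness, here is a minimal sketch of what the fork/exec route might
look like from inside a script.  It is illustrative only: it assumes a "grep"
on the PATH that accepts -F (fixed string) and -e, standing in for bmgrep, and
the helper name grep_via_pipe is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run an external fixed-string grep on one file and
# return the matching lines.
# Assumption: "grep" on PATH supports -F and -e; substitute bmgrep (and
# its own flags) if that is what is installed.
sub grep_via_pipe {
    my ($pat, $file) = @_;
    # The list form of a pipe open forks and execs grep directly, so no
    # shell ever interprets $pat.
    open(my $fh, '-|', 'grep', '-F', '-e', $pat, '--', $file)
        or die "cannot run grep: $!";
    my @hits = <$fh>;
    # grep exits with status 1 when nothing matched, so the return value
    # of close is deliberately ignored here.
    close($fh);
    return @hits;
}

# Usage: script.pl PATTERN FILE
if (@ARGV == 2) {
    print grep_via_pipe(@ARGV);
}
```

Only the matching lines cross the pipe, so perl's per-line cost is paid on
the hits rather than on every line of the file.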
--
Paul Vixie
DEC Western Research Lab	<vixie@wrl.dec.com>
Palo Alto, California		...!decwrl!vixie