Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!paperboy!think.com!sdd.hp.com!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Style Question
Keywords: split pattern match newbie
Message-ID: <11516@jpl-devvax.JPL.NASA.GOV>
Date: 20 Feb 91 18:45:49 GMT
References: <1991Feb20.131637.3541@NCoast.ORG>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 37

In article <1991Feb20.131637.3541@NCoast.ORG> jeffl@NCoast.ORG (Jeff Leyser) writes:
: Given an input file with multiple lines of the form:
: 
: 25.916 secs, 208 bytes-sec
: 
: If I want the number of seconds, and the bytes per second I can loop over
: the file and say either:
: 	m/(.*) secs, (.*) bytes-sec/;
: 	$seconds = $1;
: 	$bytes = $2;
: or
: 	($seconds,$dum1,$bytes,$dum2) = split(/ /);
: 
: Which is considered good perl form, and why?  The split is probably easier
: to maintain in the long run, but how much memory/time is being wasted
: creating the dummy variables?  Or is a split so much "better" than a
: pattern match that the cost of the dummy variables is negligible?  Or have
: I completely missed the "best" way?

I'd probably write it like this:

	($seconds, $bytes) = /(.*) secs, (.*) bytes-sec/;

or, to avoid so much backtracking,

	($seconds, $bytes) = /([\d.]+) secs, (\d+) bytes-sec/;
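
A minimal sketch of how the list-context match might sit inside a loop,
assuming a modern Perl 5 (the "my" declarations and "use strict" are not
part of the one-liners above). The sample lines and the @parsed array are
made up for illustration. The point it shows is that a failed match returns
an empty list, so the assignment itself can serve as the test:

```perl
use strict;
use warnings;

# Hypothetical sample input; the second line is deliberately malformed.
my @lines = (
    "25.916 secs, 208 bytes-sec",
    "no numbers here",
);

my @parsed;
for (@lines) {
    # In list context a successful match returns the captured groups.
    # A failed match returns the empty list, which makes the list
    # assignment evaluate to 0 (false) in the if condition.
    if (my ($seconds, $bytes) = /([\d.]+) secs, (\d+) bytes-sec/) {
        push @parsed, [$seconds, $bytes];
        print "seconds=$seconds bytes=$bytes\n";
    }
    else {
        warn "unparsed line: $_\n";
    }
}
```

Checking the result this way avoids silently reusing stale values when a
line does not fit the expected format.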

On the other hand, if there are more fields, and you know that you only want
to extract the numbers from a line, you might say:

	($secs,$bytes,$cats,$dogs) = split(/[^\d.]+/);
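
A small sketch of what that split does on the two-field line from the
question, again assuming a modern Perl 5. The second input line and the
@fields and @numbers names are invented for illustration. It also shows one
thing to watch for: a line that begins with non-numeric text produces an
empty leading field, which can be filtered out with grep:

```perl
use strict;
use warnings;

my $line = "25.916 secs, 208 bytes-sec";

# Split on runs of characters that are neither digits nor dots.
# The trailing " bytes-sec" acts as a separator, and split drops the
# empty trailing field that follows it.
my ($secs, $bytes) = split /[^\d.]+/, $line;
print "secs=$secs bytes=$bytes\n";

# A line starting with non-numeric text (hypothetical input) yields an
# empty first field, because the leading separator has nonzero width.
my @fields = split /[^\d.]+/, "elapsed 25.916 secs, 208 bytes-sec";

# Keep only the non-empty fields.
my @numbers = grep { length } @fields;
print "numbers=@numbers\n";
```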

As to which one of *those* is better style, I couldn't say.  The pattern match
is a little more self-documenting, and the split a little more generalized.
Disputandum non est de gustibus.

Larry