Path: utzoo!censor!geac!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!sun-barr!newstop!texsun!letni!mic!convex!convex.COM
From: tchrist@convex.COM (Tom Christiansen)
Newsgroups: comp.lang.perl
Subject: Re: Fast way to join lines ?
Summary: Dan's C version isn't worth the bother
Message-ID: <109688@convex.convex.com>
Date: 2 Dec 90 03:47:11 GMT
References: <9830001@hpfcso.HP.COM> <2967:Dec122:39:3790@kramden.acf.nyu.edu>
Sender: usenet@convex.com
Reply-To: tchrist@convex.COM (Tom Christiansen)
Organization: CONVEX Software Development, Richardson, TX
Lines: 106

In article <2967:Dec122:39:3790@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
:In article <9830001@hpfcso.HP.COM> hai@hpfcso.HP.COM (Hai Vo-Ba) writes:
:> I am using perl to join every N lines of a very large file
:> together and wonder what is the faster way to do this:

(code restored for further reference)

:>
:> $\ = "\n";                  # set output record separator
:>
:> while (<>) {
:>     chop;                   # strip record separator
:>     $line .= $_;
:>     if (($. % 32) == 0) {
:>         print $line;
:>         $line = '';
:>     }
:> }
:>
:> if ($line ne '') { print $line; }

Dan then writes:

:The faster way is something like this:
:
:#include <stdio.h>
:main()
:{
: int ch; int t = 33;
: while ((ch = getchar()) != EOF)
:  {
:   if (ch == '\n') if (--t) continue; else t = 33;
:   putchar(ch);
:  }
:}

Well, I'm afraid we've got just a few problems here.

The first one is that the C code doesn't do what the Perl code does,
and the poster requested a faster way to do the same thing. The Perl
construct "while (<>)" is not equivalent to "while (<STDIN>)". The
construct used by the original poster will traverse its command line
argument list and treat it as one continuous input stream, correctly
processing any "-" arguments, and defaulting to stdin if no arguments
are given. Dan's code only consults stdin, so it's not as functional.
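
To make that gap concrete, here is a rough sketch of what a C version might need in order to walk its arguments roughly the way "while (<>)" does. It is not Dan's code or anything from the thread: the names join_stream and join_args are made up for illustration, the group size of 32 is taken from the Perl code, and it only approximates the behavior of <> (files in order, "-" meaning stdin, stdin when no arguments are given, a complaint and a skip for files that cannot be opened).

```c
#include <stdio.h>
#include <string.h>

#define GROUP 32  /* lines per joined output line, as in the Perl code */

/* Copy one stream to out, dropping every newline except each GROUP-th.
 * *count carries the running line number across files, much like $. */
static void join_stream(FILE *in, FILE *out, long *count)
{
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (ch == '\n' && ++*count % GROUP != 0)
            continue;               /* swallow this newline */
        putc(ch, out);
    }
}

/* Hypothetical helper: treat the argument list as one continuous input.
 * Returns 0 on success, 1 if any named file could not be opened. */
int join_args(int argc, char **argv, FILE *out)
{
    long count = 0;
    int status = 0;
    int i;

    if (argc < 2) {                 /* no arguments: fall back to stdin */
        join_stream(stdin, out, &count);
        return 0;
    }
    for (i = 1; i < argc; i++) {
        FILE *in;

        if (strcmp(argv[i], "-") == 0) {    /* "-" means stdin */
            join_stream(stdin, out, &count);
            continue;
        }
        in = fopen(argv[i], "r");
        if (in == NULL) {           /* complain and move on to the next file */
            perror(argv[i]);
            status = 1;
            continue;
        }
        join_stream(in, out, &count);
        fclose(in);
    }
    return status;
}
```

A main() that simply returns join_args(argc, argv, stdout) would turn this into a filter. Even as a sketch, it is several times longer than the stdin-only loop, which is the point being made about functionality.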

The second problem is that (as I mentioned before) while it's good to
maintain perspective of using the right tool for the job at hand, this
*IS* comp.lang.perl, and the poster seemed to be clearly searching for
a perlian solution to his problem. How do you, Dan, know that this
wasn't just a code fragment extracted for demonstration purposes from
a larger program of the poster's?

Look at it this way: if I hung around comp.lang.c and kept posting
Perl solutions to people's C questions, it would eventually grate on
people's nerves. A non-productive flame war would start up that would
waste net bandwidth, the readers' time, and just generally rain on
everyone's parade unnecessarily. We've had a very flame-free,
productive little group here since its inception, so let's keep it
that way, OK?

The third problem is that the poster asked for a faster way. There
are several interpretations of faster, including but not limited to
faster writing time, faster compile time, faster debugging time, and
faster run time.

First let me offer a faster Perl version of the poster's original code:

    while (<>) {
        chop if $. % 32;
        print;
    }

If this doesn't need to be part of another program, you might as
well just do it this way:

    perl -pe 'chop if $. % 32'

or else

    perl -pe 'chop if $. & 31'

Now, let's first talk run time here. On my 2250-line termcap file,
Dan's C program (which you'll recall doesn't do all that the Perl
one does) runs in this much time:

    0.450524 real    0.340401 user    0.054859 sys

whereas my Perl one-liner runs in just this much time:

    0.684193 real    0.450535 user    0.083110 sys

I find that pretty respectable; I don't think we're going to quibble
about a couple seconds, let alone eleven hundredths of a second of
user time.

[ I probably shouldn't even mention that if we eat the whitespace,
mine can be reduced to 12 bytes, and Dan's to 130, but I just did anyway.
:-/ ]

As far as I'm concerned, and I'll bet you this goes for most of the
rest of the readership of this newsgroup as well, anything that you
can express as a quick one-liner without having to go into an editor
(let alone compile an a.out!) is worth doing that way. Those 0.11
seconds of user time you lost on the run are more than made up for by
how little time it took you to write and run the Perl code.
Furthermore, it's a lot more legible because its complexity is
drastically reduced, which means it'll be more maintainable as well.

--tom