Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!uunet!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.jpl.nasa.gov (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Question about split
Message-ID: <1991May31.180547.28481@jpl-devvax.jpl.nasa.gov>
Date: 31 May 91 18:05:47 GMT
References:
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 46

In article victor@watson.ibm.com writes:
: I'm a little confused about perl's behavior in split. If you run (on
: 4.003) the code below
:
: #!/usr/local/bin/perl
:
: #Test how certain patterns split
: sub test {
:     ($a) = @_;
:     @a = split(/:/,$a);
:     print "split('$a')=(",join(',',@a),") count=",scalar(@a),"\n";
: }
:
: &test('a:b:c: ');
: &test('a:b:c:');
: &test('a:b:c');
: #end of program
:
:
: You get the results:
:
: split('a:b:c: ')=(a,b,c, ) count=4
: split('a:b:c:')=(a,b,c) count=3
: split('a:b:c')=(a,b,c) count=3
:
: Which I find a little counter-intuitive: I thought that perl should
: distinguish between the second and third cases. I would have thought
: that the output of the second case should have been
:
: split('a:b:c:')=(a,b,c,) count=4

A careful reading of the documentation for split will point out the fact
that null trailing fields are stripped if no limit is specified.

: Why is it done the way that it is?

The primary reason is that it surprises people less frequently. Especially
when they split on whitespace and there's trailing whitespace, such as an
unstripped newline.

Note that the semantics of individual fields is much the same, since an
undefined field evaluates to the same value as a null field. It's only of
concern to people counting fields. And you can get the other behavior
by supplying a limit.

Other than that, no particular reason... :-)

Larry
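
A minimal sketch of the "supply a limit" point, for anyone counting fields. The helper name show is made up for illustration and is not from the original test program. The sketch assumes a Perl 5 interpreter, where a negative limit is documented as "arbitrarily large"; whether 4.003 accepts a negative limit is not confirmed here, so a positive limit larger than the expected field count is shown as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: format a split result the same way the
# test program in the quoted article does.
sub show {
    my ($str, @fields) = @_;
    return "split('$str')=(" . join(',', @fields) . ") count=" . scalar(@fields);
}

my $s = 'a:b:c:';

# No limit: the trailing null field is stripped.
print show($s, split(/:/, $s)), "\n";

# Positive limit larger than the number of fields: trailing null field kept.
print show($s, split(/:/, $s, 10)), "\n";

# Negative limit (assumes Perl 5): treated as arbitrarily large, same result.
print show($s, split(/:/, $s, -1)), "\n";
```

With a limit supplied, 'a:b:c:' and 'a:b:c' yield different counts (4 and 3), which is the distinction the original question was after.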