Path: utzoo!attcan!uunet!convex!convex.com!tchrist
From: tchrist@convex.com (Tom Christiansen)
Newsgroups: comp.lang.perl
Subject: Frequently Asked Questions about Perl - with Answers [Monthly posting]
Message-ID: <108315@convex.convex.com>
Date: 6 Nov 90 21:42:04 GMT
Expires: 7 Dec 90 05:00:00 GMT
Sender: usenet@convex.com
Reply-To: tchrist@Convex.COM (Tom Christiansen)
Organization: Convex Computer Corp, Richardson, TX
Lines: 553

[Last changed: $Date: 90/11/06 15:00:03 $ by $Author: tchrist $]

This article contains answers to some of the most frequently asked
questions in comp.lang.perl.  They're all good questions, but they come
up often enough that substantial net bandwidth can be saved by looking
here first before asking.  Before posting a question, you really should
consult the Perl man page; there's a lot of information packed in there.

Some questions in this group aren't really about Perl, but rather about
system-specific issues.  You might also consult the Most Frequently Asked
Questions list in comp.unix.questions for answers to this type of question.

This list is maintained by Tom Christiansen.  If you have any suggested
additions or corrections to this article, please send them to him at
tchrist@convex.com.  Special thanks to Larry Wall for reviewing this list
for accuracy and especially for writing and releasing perl in the first
place.

List of Questions:

    1)  What is Perl?
    2)  Where can I get Perl?
    3)  How can I get Perl via UUCP?
    4)  Where can I get documentation and examples for Perl?
    5)  Are archives of comp.lang.perl available?
    6)  Is Perl available for machine FOO?
    7)  What are all these $@%<> signs and how do I know when to use them?
    8)  Why don't backticks work as they do in shells?
    9)  How come Perl operators have different precedence than C operators?
    10) How come my converted awk/sed/sh script runs more slowly in Perl?
    11) There's an a2p and an s2p; why isn't there a p2c?
    12) Where can I get undump for my machine?
    13) How can I call my system's unique functions from Perl?
    14) Where do I get the include files to do ioctl() or syscall()?
    15) Why doesn't "local($foo) = <FILE>;" work right?
    16) How can I detect keyboard input without reading it?
    17) How do I make an array of arrays?
    18) How can I quote a variable to use in a regexp?
    19) Why do setuid Perl scripts complain about kernel problems?
    20) How do I open a pipe both to and from a command?
    21) How can I change the first N letters of a string?

To skip ahead to a particular question, such as question 17, you can
search for the regular expression "^17)".

1)  What is Perl?

    A programming language, by Larry Wall.  Here's the beginning of the
    description from the man page:

        Perl is an interpreted language optimized for scanning arbitrary
        text files, extracting information from those text files, and
        printing reports based on that information.  It's also a good
        language for many system management tasks.  The language is
        intended to be practical (easy to use, efficient, complete) rather
        than beautiful (tiny, elegant, minimal).  It combines (in the
        author's opinion, anyway) some of the best features of C, sed, awk,
        and sh, so people familiar with those languages should have little
        difficulty with it.  (Language historians will also note some
        vestiges of csh, Pascal, and even BASIC-PLUS.)

        Expression syntax corresponds quite closely to C expression syntax.
        Unlike most Unix utilities, Perl does not arbitrarily limit the
        size of your data--if you've got the memory, Perl can slurp in your
        whole file as a single string.  Recursion is of unlimited depth.
        And the hash tables used by associative arrays grow as necessary to
        prevent degraded performance.  Perl uses sophisticated pattern
        matching techniques to scan large amounts of data very quickly.
        Although optimized for scanning text, Perl can also deal with
        binary data, and can make dbm files look like associative arrays
        (where dbm is available).
        Setuid Perl scripts are safer than C programs through a dataflow
        tracing mechanism which prevents many stupid security holes.  If
        you have a problem that would ordinarily use sed or awk or sh, but
        it exceeds their capabilities or must run a little faster, and you
        don't want to write the silly thing in C, then Perl may be for you.
        There are also translators to turn your sed and awk scripts into
        Perl scripts.

2)  Where can I get Perl?

    From any comp.sources.unix archive.  These machines definitely have it
    available for anonymous FTP:

        uunet.uu.net              192.48.96.2
        tut.cis.ohio-state.edu    128.146.8.60
        jpl-devvax.jpl.nasa.gov   128.149.1.143

3)  How can I get Perl via UUCP?

    You can get it from the site osu-cis; here is the appropriate info,
    thanks to J Greely.

        E-mail contact:
            osu-cis!uucp

        Get these two files first:
            osu-cis!~/GNU.how-to-get.
            osu-cis!~/ls-lR.Z

        Current Perl distribution:
            osu-cis!~/perl/3.0/kits@36/perl.kitXX.Z (XX=01-32)
            osu-cis!~/perl/3.0/patches/patch37.Z

        How to reach osu-cis via uucp (L.sys/Systems file lines):

    #
    # Direct Trailblazer
    #
    osu-cis Any ACU 19200 1-614-292-5112 in:--in:--in: Uanon
    #
    # Direct V.32 (MNP 4)
    # dead, dead, dead...sigh.
    #
    #osu-cis Any ACU 9600 1-614-292-1153 in:--in:--in: Uanon
    #
    # Micom port selector, at 1200, 2400, or 9600 bps.
    # Replace ##'s below with 12, 24, or 96 (both speed and phone number).
    #
    osu-cis Any ACU ##00 1-614-292-31## "" \r\c Name? osu-cis nected \c GO \d\r\d\r\d\r in:--in:--in: Uanon

    Modify as appropriate for your site, of course, to deal with your
    local telephone system.  There are no limitations concerning the hours
    of the day you may call.

4)  Where can I get documentation and examples for Perl?

    For now, the best source is the man page, all ~74 troffed pages of it.
    There's a book in the works, but that won't be out until the end of
    1990; it will be published as a Nutshell Handbook by O'Reilly &
    Associates.

    For examples of Perl scripts, look in the Perl source directory in the
    eg subdirectory.
    You can also find a good deal of them on tut in the pub/perl/scripts/
    subdirectory.

    A nice reference card by Johan Vromans is also available; originally in
    postscript form, it's now also available in TeX and troff forms,
    although these don't print as nicely.  The postscript version can be
    FTP'd from tut and jpl-devvax.

    A brief (~2-hour) tutorial by Tom Christiansen is available in troff
    form on tut in pub/perl/scripts/tchrist/slides/.  Numerous examples of
    his are also available there.

    Additionally, USENIX has been sponsoring tutorials on Perl at their
    system administration and general conferences.  You might consider
    attending one of these.

    You should read the USENET comp.lang.perl newsgroup for all sorts of
    discussions regarding the language, bugs, features, history and trivia.
    Larry Wall is a very frequent poster here, as well as many other
    seasoned perl programmers.

5)  Are archives of comp.lang.perl available?

    Not at the moment; however, if someone on the Internet should volunteer
    the disk space, something might be able to be arranged, as archives
    have been kept.

6)  Is Perl available for machine FOO?

    Perl comes with an elaborate auto-configuration script that allows Perl
    to be painlessly ported to a wide variety of platforms, including
    non-UNIX ones.  Amiga and MS-DOS binaries are available on jpl-devvax
    for anonymous FTP.  Try to bring Perl up on your machine, and if you
    have problems, post to comp.lang.perl about them.

7)  What are all these $@%<> signs and how do I know when to use them?

    Those are type specifiers: $ for scalar values, @ for indexed arrays,
    and % for hashed arrays.

    Always make sure to use a $ for single values and @ for multiple ones.
    Thus element 2 of the @foo array is accessed as $foo[2], not @foo[2].
    You could use @foo[1..3] for a slice of three elements of @foo; this is
    the same as ($foo[1], $foo[2], $foo[3]).

    While there are a few places where you don't actually need these type
    specifiers, except for files, you should always use them.
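
    As a small illustration of the $ versus @ rule, here is a sketch that
    pulls out a single element, a slice, and a single hash value.  The
    array and hash contents are made up for the example, and it assumes
    that $[ is 0.

```perl
# Hypothetical data, invented for this example.
@foo   = ('zero', 'one', 'two', 'three', 'four');
%color = ('apple', 'red', 'banana', 'yellow');

$elt   = $foo[2];                       # one value, so use $
@slice = @foo[1..3];                    # several values, so use @
@same  = ($foo[1], $foo[2], $foo[3]);   # the slice written out longhand
$fruit = $color{'apple'};               # one value out of a hash: $ again

print "element: $elt\n";
print "slice:   @slice\n";
print "same:    @same\n";
print "apple:   $fruit\n";
```

    The last assignment shows that the rule carries over to hashed arrays:
    the whole thing is %color, but one value out of it is $color{'apple'}.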

    Note that <FILE> is NOT the type specifier for files; it's the
    equivalent of awk's getline function, that is, it reads a line from the
    handle FILE.  When doing open, close, and other operations besides the
    getline function on files, do NOT use the brackets.

    Normally, files are manipulated something like this (with appropriate
    error checking added if it were production code):

        open (FILE, ">/tmp/foo.$$");
        print FILE "string\n";
        close FILE;

    If instead of a filehandle, you use a normal scalar variable with file
    manipulation functions, this is considered an indirect reference to a
    filehandle.  For example,

        $foo = "TEST01";
        open($foo, "file");

    After the open, these two while loops are equivalent:

        while (<$foo>) {}
        while (<TEST01>) {}

    as are these two statements:

        close $foo;
        close TEST01;

8)  Why don't backticks work as they do in shells?

    Because backticks do not interpolate within double quotes in Perl as
    they do in shells.  Let's look at two common mistakes:

    1)  This does not run wc, because the backticks sit inside double
        quotes:

            $foo = "$bar is `wc $file`";

        This should have been:

            $foo = "$bar is " . `wc $file`;

        But you'll have an extra newline you might not expect.

    2)  This does not work as expected:

            $back = `pwd`; chdir($somewhere); chdir($back);

        It fails because backticks do not automatically eat trailing or
        embedded newlines.  The chop() function will remove the last
        character from a string.  This should have been:

            chop($back = `pwd`); chdir($somewhere); chdir($back);

9)  How come Perl operators have different precedence than C operators?

    Actually, they don't; all C operators have the same precedence in Perl
    as they do in C.  The problem is with a class of functions called list
    operators, e.g. print, chdir, exec, system, and so on.  These are
    somewhat bizarre in that they have different precedence depending on
    whether you look on the left or right of them.  Basically, they gobble
    up all things on their right.  For example,

        unlink $foo, "bar", @names, "others";

    will unlink all those file names.
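
    To watch the rightward gobbling happen without touching any files,
    here is a small sketch using reverse, which is also a list operator.
    The numbers are arbitrary.  Without parentheses reverse takes every
    remaining item in the list; with them it takes only what is inside.

```perl
# reverse is a list operator: unparenthesized, it swallows everything
# to its right, up to the closing paren of the enclosing list.
@greedy  = (1, reverse 2, 3, 4);     # reverse gets (2, 3, 4)
@limited = (1, reverse(2, 3), 4);    # reverse gets only (2, 3)

print "greedy:  @greedy\n";
print "limited: @limited\n";
```

    In the first assignment the 4 ends up in the middle of the result
    because it was handed to reverse along with the 2 and 3.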

    A common mistake is to write:

        unlink "a_file" || die "snafu";

    The problem is that this gets interpreted as

        unlink("a_file" || die "snafu");

    To avoid this problem, you can always make them look like function
    calls or use an extra level of parentheses:

        (unlink "a_file") || die "snafu";
        unlink("a_file")  || die "snafu";

    See the Perl man page's section on Precedence for more gory details.

10) How come my converted awk/sed/sh script runs more slowly in Perl?

    The natural way to program in those languages may not make for the
    fastest Perl code.  Notably, the awk-to-perl translator produces
    sub-optimal code; see the a2p man page for tweaks you can make.

    How complex are your regexps?  Deeply nested sub-expressions with {n,m}
    or * operators can take a very long time to compute.  Don't use ()'s
    unless you really need them.  Anchor your string to the front if you
    can.

    Something like this

        next unless /^.*%.*$/;

    runs more slowly than the equivalent:

        next unless /%/;

    Note that this:

        next if /Mon/;
        next if /Tue/;
        next if /Wed/;
        next if /Thu/;
        next if /Fri/;

    runs faster than this:

        next if /Mon/ || /Tue/ || /Wed/ || /Thu/ || /Fri/;

    which in turn runs faster than this:

        next if /Mon|Tue|Wed|Thu|Fri/;

    which runs *much* faster than:

        next if /(Mon|Tue|Wed|Thu|Fri)/;

    Remember that a printf costs more than a simple print.

    Another thing to look at is your loops.  Are you iterating through
    indexed arrays rather than just putting everything into a hashed
    array?  For example,

        @list = ('abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'stv');

        for $i ($[ .. $#list) {
            if ($pattern eq $list[$i]) {
                $found++;
            }
        }

    First of all, it would be faster to use Perl's foreach mechanism
    instead of using subscripts:

        foreach $elt (@list) {
            if ($pattern eq $elt) {
                $found++;
            }
        }

    Better yet, this could be sped up dramatically by placing the whole
    thing in an associative array like this:

        %list = ('abc', 1, 'def', 1, 'ghi', 1, 'jkl', 1,
                 'mno', 1, 'pqr', 1, 'stv', 1 );
        $found = $list{$pattern};

    (but put the %list assignment outside of your input loop.)

    You should also look for variables interpolated into regular
    expressions, which is expensive.  If the variable to be interpolated
    doesn't change over the life of the process, use the /o modifier to
    tell Perl to compile the regexp only once, like this:

        for $i (1..100) {
            if (/$foo/o) {
                do some_func($i);
            }
        }

    Finally, suppose you have a bunch of patterns in a list that you'd
    like to compare against.  You could do this:

        @pats = ('_get.*', 'bogus', '_read', '.*exit');
        foreach $pat (@pats) {
            if ( $name =~ /^$pat$/ ) {
                do some_fun();
                last;
            }
        }

    but if you build your code and then eval it, it will be much faster.
    For example:

        @pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');
        # NOTE: the rest of this example is cut off in this copy; the
        # lines below are a reconstruction of the idea, not original text.
        $code = <<EOS;
            while (<>) {
                study;
EOS
        foreach $pat (@pats) {
            $code .= <<EOS;
                if ( /^$pat\$/ ) {
                    do some_fun();
                    next;
                }
EOS
        }
        $code .= "}\n";
        eval $code;

    [...]

15) Why doesn't "local($foo) = <FILE>;" work right?

    Well, it does.  The thing to remember is that local() provides an
    array context, and that the <FILE> syntax in an array context will
    read all the lines in a file.  To work around this, use:

        local($foo);
        $foo = <FILE>;

    If you are at a recent patchlevel, you can use the scalar() operator
    to cast the expression into a scalar context:

        local($foo) = scalar(<FILE>);

16) How can I detect keyboard input without reading it?

    You might check out the Frequently Asked Questions list in comp.unix.*
    for things like this: the answer is essentially the same.  It's very
    system dependent.  Here's one solution that works on BSD systems:

        sub key_ready {
            local($rin, $nfd);
            vec($rin, fileno(STDIN), 1) = 1;
            return $nfd = select($rin,undef,undef,0);
        }

17) How can I make an array of arrays?

    You can use the multi-dimensional array emulation of $a{'x','y','z'},
    or you can make an array of names of arrays and eval it.

    For example, if @name contains a list of names of arrays, you can get
    at the j-th element of the i-th array like so:

        $ary = $name[$i];
        $val = eval "\$${ary}[\$j]";

    or in one line

        $val = eval "\$" . $name[$i] . "[\$j]";

    You could also use the type-globbing syntax to make an array of *name
    values, which will be more efficient than eval.  For example:

        { local(*ary) = $name[$i]; $val = $ary[$j]; }

18) How can I quote a variable to use in a regexp?

    From the manual:

        $pattern =~ s/(\W)/\\$1/g;

    Now you can freely use /$pattern/ without fear of any unexpected
    meta-characters in it throwing off the search.

19) Why do setuid Perl scripts complain about kernel problems?

    This message:

        YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET!
        FIX YOUR KERNEL, PUT A C WRAPPER AROUND THIS SCRIPT, OR USE -u AND UNDUMP!

    is triggered because setuid scripts are inherently insecure due to a
    kernel bug.  If your system has fixed this bug, you can compile Perl
    so that it knows this.  Otherwise, create a setuid C program that just
    execs Perl with the name of the script.

20) How do I open a pipe both to and from a command?

    In general, this is a dangerous move because you can find yourself in
    a deadlock situation.  It's better to put one end of the pipe to a
    file.  For example:

        # first write some_cmd's input into a_file, then
        open(CMD, "some_cmd its_args < a_file |");
        while (<CMD>) {
            # process each line of the command's output
        }

        # or else the other way; run the cmd
        open(CMD, "| some_cmd its_args > a_file");
        while ($condition) {
            print CMD "some output\n";
            # other code deleted
        }
        close(CMD) || warn "cmd exited $?";

        # now read the file
        open(FILE,"a_file");
        while (<FILE>) {
            # process each line the command wrote
        }

    At the risk of deadlock, it is possible to use a fork, two pipe calls,
    and an exec to manually set up the two-way pipe.  If you have ptys,
    you could arrange to run the command on a pty and avoid the deadlock
    problem.

21) How can I change the first N letters of a string?

    Remember that the substr() function produces an lvalue, that is, it
    may be assigned to.  Therefore, to change the first character to an S,
    you could do this:

        substr($var,0,1) = 'S';

    This assumes that $[ is 0; for a library routine where you can't know
    $[, you should use this instead:

        substr($var,$[,1) = 'S';

    While it would be slower, you could in this case use a substitute:

        $var =~ s/^./S/;

    But this won't work if the string is empty or its first character is a
    newline, which "." will never match.  So you could use this instead:

        $var =~ s/^[^\0]?/S/;

    To do things like translation of the first part of a string, use
    substr, as in:

        substr($var, $[, 10) =~ tr/a-z/A-Z/;

    If you don't know the length of what to translate, something like this
    works:

        /^(\S+)/ && substr($_,$[,length($1)) =~ tr/a-z/A-Z/;

    For some things it's convenient to use the /e switch of the substitute
    operator:

        s/^(\S+)/($tmp = $1) =~ tr#a-z#A-Z#, $tmp/e

    although in this case, it runs slower than the previous example.
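
    The substr() approach can be wrapped up in a small routine.  What
    follows is only a sketch: upcase_first is a hypothetical name, not
    something from the Perl library, and it assumes $[ is 0 rather than
    using the $[ form recommended above for library code.

```perl
# Hypothetical helper: upper-case the first $n characters of a string.
# Assumes $[ is 0.
sub upcase_first {
    local($str, $n) = @_;
    substr($str, 0, $n) =~ tr/a-z/A-Z/;   # tr works in place on the lvalue
    $str;                                 # return the modified copy
}

$var = "hello world";
substr($var, 0, 1) = 'S';                 # assign straight into the string
print "$var\n";
print &upcase_first("perl is practical", 4), "\n";
```

    Because the routine copies its argument into a local variable first,
    the caller's string is left alone and the changed copy is returned.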