|
Welcome to Perl.
Perl is the Swiss Army chainsaw of scripting languages: powerful and adaptable. It was first developed by Larry Wall, a linguist working as a systems administrator for NASA in the late 1980s, as a way to make report processing easier. Since then, it has moved into a large number of roles: automating system administration, acting as glue between different computer systems; and, of course, being one of the most popular languages for CGI programming on the Web.
Why did Perl become so popular when the Web came along? Two reasons: First, most of what is being done on the Web happens with text, and is best done with a language that's designed for text processing. More importantly, Perl was appreciably better than the alternatives at the time when people needed something to use. C is complex and can produce security problems (especially with untrusted data), Tcl can be awkward and Python didn't really have a foothold.
It also didn't hurt that Perl is a friendly language. It plays well with your personal programming style. The Perl slogan is ``There's more than one way to do it,'' and that lends itself well to large and small problems alike.
In this first part of our series, you'll learn a few basics about Perl and see a small sample program.
In this series, I'm going to assume that you're using a Unix system and
that your Perl interpreter is located at /usr/local/bin/perl
. It's OK
if you're running Windows; most Perl code is platform-independent.
Take the following text and put it into a file called first.pl
:
#!/usr/local/bin/perl print "Hi there!\n";
(Traditionally, first programs are supposed to say Hello world!
,
but I'm an iconoclast.)
Now, run it with your Perl interpreter. From a command line, go to the
directory with this file and type perl first.pl
. You should see:
Hi there!
The \n
indicates the ``newline'' character; without it, Perl doesn't
skip to a new line of text on its own.
Perl has a rich library of functions. They're the verbs of Perl, the commands that the interpreter runs. You can see a list of all the built-in functions on the perlfunc main page. Almost all functions can be given a list of parameters, which are separated by commas.
The print
function is one of the most frequently used parts of
Perl. You use it to display things on the screen or to send
information to a file (which we'll discuss in the next article). It
takes a list of things to output as its parameters.
print "This is a single statement."; print "Look, ", "a ", "list!";
A Perl program consists of statements, each of which ends with a semicolon. Statements don't need to be on separate lines; there may be multiple statements on one line or a single statement can be split across multiple lines.
print "This is "; print "two statements.\n"; print "But this ", "is only one statement.\n";
There are two basic data types in Perl: numbers and strings.
Numbers are easy; we've all dealt with them. The only thing you need to know is that you never insert commas or spaces into numbers in Perl. always write 10000, not 10,000 or 10 000.
Strings are a bit more complex. A string is a collection of characters in either single or double quotes:
'This is a test.' "Hi there!\n"
The difference between single quotes and double quotes is that single
quotes mean that their contents should be taken literally, while double
quotes mean that their contents should be interpreted. For example,
the character sequence \n
is a newline character when it appears in a
string with double quotes, but is literally the two characters, backslash
and n
, when it appears in single quotes.
print "This string\nshows up on two lines."; print 'This string \n shows up on only one.';
(Two other useful backslash sequences are \t
to insert a tab character,
and \\
to insert a backslash into a double-quoted string.)
If functions are Perl's verbs, then variables are its nouns. Perl has three types of variables: scalars, arrays and hashes. Think of them as ``things,'' ``lists,'' and ``dictionaries.'' In Perl, all variable names are a punctuation character, a letter or underscore, and one or more alphanumeric characters or underscores.
Scalars are single things. This might be a number or a string.
The name of a scalar begins with a dollar sign, such as $i
or
$abacus
. You assign a value to a scalar by telling Perl what it
equals, like so:
$i = 5; $pie_flavor = 'apple'; $constitution1776 = "We the People, etc.";
You don't need to specify whether a scalar is a number or a string. It doesn't matter, because when Perl needs to treat a scalar as a string, it does; when it needs to treat it as a number, it does. The conversion happens automatically. (This is different from many other languages, where strings and numbers are two separate data types.)
If you use a double-quoted string, Perl will insert the value of any scalar variables you name in the string. This is often used to fill in strings on the fly:
$apple_count = 5; $count_report = "There are $apple_count apples."; print "The report is: $count_report\n";
The final output from this code is The report is: There are 5 apples.
.
Numbers in Perl can be manipulated with the usual mathematical
operations: addition, multiplication, division and subtraction.
(Multiplication and division are indicated in Perl with the *
and
/
symbols, by the way.)
$a = 5; $b = $a + 10; # $b is now equal to 15. $c = $b * 10; # $c is now equal to 150. $a = $a - 1; # $a is now 4, and algebra teachers are cringing.
You can also use special operators like ++
, --
, +=
, -=
, /=
and *=
.
These manipulate a scalar's value without needing two elements in an
equation. Some people like them, some don't. I like the fact that they
can make code clearer.
$a = 5; $a++; # $a is now 6; we added 1 to it. $a += 10; # Now it's 16; we added 10. $a /= 2; # And divided it by 2, so it's 8.
Strings in Perl don't have quite as much flexibility. About the only basic operator that you can use on strings is concatenation, which is a $10 way of saying ``put together.'' The concatenation operator is the period. Concatenation and addition are two different things:
$a = "8"; # Note the quotes. $a is a string. $b = $a + "1"; # "1" is a string too. $c = $a . "1"; # But $b and $c have different values!
Remember that Perl converts strings to numbers transparently whenever
it's needed, so to get the value of $b
, the Perl interpreter converted
the two strings "8"
and "1"
to numbers, then added them. The value of
$b
is the number 9. However, $c
used concatenation, so its value is
the string "81"
.
Just remember, the plus sign adds numbers and the period puts strings together.
Arrays are lists of scalars. Array names begin with @
. You define
arrays by listing their contents in parentheses, separated by commas:
@lotto_numbers = (1, 2, 3, 4, 5, 6); # Hey, it could happen. @months = ("July", "August", "September");
The contents of an array are indexed beginning with 0. (Why not 1?
Because. It's a computer thing.) To retrieve the elements of an array,
you replace the @
sign with a $
sign, and follow that with the
index position of the element you want. (It begins with a dollar sign
because you're getting a scalar value.) You can also modify it in place,
just like any other scalar.
@months = ("July", "August", "September"); print $months[0]; # This prints "July". $months[2] = "Smarch"; # We just renamed September!
If an array doesn't exist, by the way, you'll create it when you try to assign a value to one of its elements.
$winter_months[0] = "December"; # This implicitly creates @winter_months.
Arrays always return their contents in the same order; if you go
through @months
from beginning to end, no matter how many times
you do it, you'll get back July
, August
and September
in that order.
If you want to find the length
of an array, use the value $#array_name
. This is one less than
the number of elements in the array. If the array just doesn't exist
or is empty, $#array_name
is -1. If you want to resize an array,
just change the value of $#array_name
.
@months = ("July", "August", "September"); print $#months; # This prints 2. $a1 = $#autumn_months; # We don't have an @autumn_months, so this is -1. $#months = 0; # Now @months only contains "July".
Hashes are called ``dictionaries'' in some programming languages,
and that's what they are: a term and a definition, or in more correct
language a key and a value. Each key in a hash has one and only one
corresponding value. The name of a hash begins with a percentage sign, like %parents
. You define hashes by comma-separated pairs of
key and value, like so:
%days_in_month = ( "July" => 31, "August" => 31, "September" => 30 );
You can fetch any value from a hash by referring to $hashname{key}
,
or modify it in place just like any other scalar.
print $days_in_month{"September"}; # 30, of course. $days_in_month{"February"} = 29; # It's a leap year.
If you want to see what keys are in a hash, you can use the keys
function with the name of the hash. This returns a list containing
all of the keys in the hash. The list isn't always in the same order,
though; while we could count on @months
to always return
July
, August
, September
in that order, keys %days_in_summer
might
return them in any order whatsoever.
@month_list = keys %days_in_summer; # @month_list is now ('July', 'September', 'August') !
The three types of variables have three separate namespaces. That
means that $abacus
and @abacus
are two different variables, and
$abacus[0]
(the first element of @abacus
) is not the same as
$abacus{0}
(the value in %abacus
that has the key 0
).
Notice that in some of the code samples from the previous section, I've used code comments. These are useful for explaining what a particular piece of code does, and vital for any piece of code you plan to modify, enhance, fix, or just look at again. (That is to say, comments are vital for all code.)
Anything in a line of Perl code that follows a # sign is a comment. (Except, of course, if the # sign appears in a string.)
print "Hello world!\n"; # That's more like it. # This entire line is a comment.
Almost every time you write a program, you'll need to use a loop. Loops allow you run a particular piece of code over and over again. This is part of a general concept in programming called flow control.
Perl has several different functions that are useful for flow control,
the most basic of which is for
. When you use the for
function,
you specify a variable that will be used for the loop index, and a
list of values to loop over. Inside a pair of curly brackets, you put
any code you want to run during the loop:
for $i (1, 2, 3, 4, 5) { print "$i\n"; }
This loop prints the numbers 1 through 5, each on a separate line.
A handy shortcut for defining loops is using ..
to specify a range
of numbers. You can write (1, 2, 3, 4, 5) as (1 .. 5). You can also use
arrays and scalars in your loop list. Try this code and see what happens:
@one_to_ten = (1 .. 10); $top_limit = 25; for $i (@one_to_ten, 15, 20 .. $top_limit) { print "$i\n"; }
The items in your loop list don't have to be numbers; you can use strings
just as easily. If the hash %month
_has contains names of months and
the number of days in each month, you can use the keys function to step
through them.
for $i (keys %month_has) { print "$i has $month_has{$i} days.\n"; }
for $marx ('Groucho', 'Harpo', 'Zeppo', 'Karl') { print "$marx is my favorite Marx brother.\n"; }
You now know enough about Perl - variables, print, and for()
- to write a small, useful program. Everyone loves money, so the first sample
program is a compound-interest calculator. It will print a (somewhat)
nicely formatted table showing the value of an investment over a number
of years. (You can see the program at compound_interest.pl)
The single most complex line in the program is this one:
$interest = int (($apr / 100) * $nest_egg * 100) / 100;
$apr / 100
is the interest rate, and ($apr / 100) * $nest_egg
is
the amount of interest earned in one year. This line uses the
int()
function, which returns the integer value of a scalar (its
value after any fractional part has been stripped off). We use
int()
here because when you multiply, for example, 10925 by
9.25%, the result is 1010.5625, which we must round off to
1010.56. To do this, we multiply by 100, yielding 101056.25, use
int()
to throw away the leftover fraction, yielding 101056, and
then divide by 100 again, so that the final result is 1010.56. Try
stepping through this statement yourself to see just how we end up
with the correct result, rounded to cents.
At this point you have some basic knowledge of Perl syntax and a few
simple toys to play with - print
, for()
, keys()
, and int()
.
Try writing some simple programs with them. Here are two suggestions,
one simple and the other a little more complex:
\n
to go to a new line.)