1.6 Control Structures

So far, except for our one large example, all of our examples have been completely linear; we executed each command in order. We've seen a few examples of using the short circuit operators to cause a single command to be (or not to be) executed. While you can write some very useful linear programs (a lot of CGI scripts fall into this category), you can write much more powerful programs if you have conditional expressions and looping mechanisms. Collectively, these are known as control structures. So you can also think of Perl as a control language.

But to have control, you have to be able to decide things, and to decide things, you have to know the difference between what's true and what's false.

What Is Truth?

We've bandied about the term truth,[23] and we've mentioned that certain operators return a true or a false value. Before we go any further, we really ought to explain exactly what we mean by that. Perl treats truth a little differently than most computer languages, but after you've worked with it awhile it will make a lot of sense. (Actually, we're hoping it'll make a lot of sense after you've read the following.)

[23] Strictly speaking, this is not true.

Basically, Perl holds truths to be self-evident. That's a glib way of saying that you can evaluate almost anything for its truth value. Perl uses practical definitions of truth that depend on the type of thing you're evaluating. As it happens, there are many more kinds of truth than there are of nontruth.

Truth in Perl is always evaluated in a scalar context. (Other than that, no type coercion is done.) So here are the rules for the various kinds of values that a scalar can hold:

1. Any string is true except for "" and "0".
2. Any number is true except for 0.
3. Any reference is true.
4. Any undefined value is false.

Actually, the last two rules can be derived from the first two. Any reference (rule 3) points to something with an address, and would evaluate to a number or string containing that address, which is never 0. And any undefined value (rule 4) would always evaluate to 0 or the null string.

And in a way, you can derive rule 2 from rule 1 if you pretend that everything is a string. Again, no coercion is actually done to evaluate truth, but if a coercion to string were done, then any numeric value of 0 would simply turn into the string "0", and be false. Any other number would not turn into the string "0", and so would be true. Let's look at some examples so we can understand this better:

0          # would become the string "0", so false
1          # would become the string "1", so true
10 - 10    # 10-10 is 0, would convert to string "0", so false
0.00       # becomes 0, would convert to string "0", so false
"0"        # the string "0", so false
""         # a null string, so false
"0.00"     # the string "0.00", neither empty nor exactly "0", so true
"0.00" + 0 # the number 0 (coerced by the +), so false.
\$a        # a reference to $a, so true, even if $a is false
undef()    # a function returning the undefined value, so false
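The same values can be fed through a real conditional to confirm the table. This is only a sketch: the truth helper and the variable $x are names made up for the illustration, and the my declarations and the strict and warnings pragmas are not something this section has introduced.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl evaluates its argument for truth.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

my $x = 0;    # a false value, so we can take a reference to it below

print '0          is ', truth(0),          "\n";   # false
print '1          is ', truth(1),          "\n";   # true
print '10 - 10    is ', truth(10 - 10),    "\n";   # false
print '0.00       is ', truth(0.00),       "\n";   # false
print '"0"        is ', truth("0"),        "\n";   # false
print '""         is ', truth(""),         "\n";   # false
print '"0.00"     is ', truth("0.00"),     "\n";   # true
print '"0.00" + 0 is ', truth("0.00" + 0), "\n";   # false
print '\$x        is ', truth(\$x),        "\n";   # true, though $x is false
print 'undef()    is ', truth(undef()),    "\n";   # false
```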

Since we mumbled something earlier about truth being evaluated in a scalar context, you might be wondering what the truth value of a list is. Well, the simple fact is, there is no operation in Perl that will return a list in a scalar context. They all return a scalar value instead, and then you apply the rules of truth to that scalar. So there's no problem, as long as you can figure out what any given operator will return in a scalar context.
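As one small illustration of that point, here is what an array hands back when it is asked for a scalar. The array names and their contents are invented for the sketch, and the my declarations are a convention this section has not covered.

```perl
use strict;
use warnings;

my @empty = ();
my @users = ("root", "lp", "larry");   # made-up account names

# Assigning an array to a scalar forces scalar context: the result is
# the number of elements, not a list.
my $user_count  = @users;    # 3
my $empty_count = @empty;    # 0

# A condition supplies the same scalar context, so the ordinary truth
# rules are applied to those counts.
my $users_truth = @users ? "true" : "false";
my $empty_truth = @empty ? "true" : "false";

print "\@users: $user_count elements, so $users_truth\n";
print "\@empty: $empty_count elements, so $empty_truth\n";
```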

The if and unless statements

We saw earlier how a logical operator could function as a conditional. A slightly more complex form of the logical operators is the if statement. The if statement evaluates a truth condition, and executes a block if the condition is true.

A block is one or more statements grouped together by a set of braces. Since the if statement executes a block, the braces are required by definition. If you know a language like C, you'll notice that this is different. Braces are optional in C if you only have a single statement, but they are not optional in Perl.

if ($debug_level > 0) {
    # Something has gone wrong.  Tell the user.
    print "Debug: Danger, Will Robinson, danger!\n";
    print "Debug: Answer was '54', expected '42'.\n";
}

Sometimes, just executing a block when a condition is met isn't enough. You may also want to execute a different block if that condition isn't met. While you could certainly use two if statements, one the negation of the other, Perl provides a more elegant solution. After the block, if can take an optional second block, introduced by else, to be executed only if the truth condition is false. (Veteran computer programmers will not be surprised at this point.)
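A two-way choice might look something like the following sketch. The $debug_level value and the message strings are invented for the illustration, and the my declarations are not part of this section's examples.

```perl
use strict;
use warnings;

my $debug_level = 0;    # hypothetical setting; try changing it to 1
my $message;

if ($debug_level > 0) {
    # Runs only when the condition is true.
    $message = "Debug: extra diagnostics are on.";
}
else {
    # Runs only when the condition is false.
    $message = "Running quietly.";
}

print "$message\n";
```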

Other times, you may even have more than two possible choices. In this case, you'll want to add an elsif truth condition for the other possible choices. (Veteran computer programmers may well be surprised by the spelling of "elsif", for which nobody here is going to apologize. Sorry.)

if ($city eq "New York") {
    print "New York is northeast of Washington, D.C.\n";
}
elsif ($city eq "Chicago") {
    print "Chicago is northwest of Washington, D.C.\n";
}
elsif ($city eq "Miami") {
    print "Miami is south of Washington, D.C.  And much warmer!\n";
}
else {
    print "I don't know where $city is, sorry.\n";
}

The if and elsif conditions are each evaluated in turn, until one is found to be true or the else clause is reached. When one of the conditions is found to be true, its block is executed and all the remaining branches are skipped.

Sometimes, you don't want to do anything if the condition is true, only if it is false. Using an empty if with an else may be messy, and a negated if may be illegible; it sounds weird to say "do something if not this is true". In these situations, you would use the unless statement.

unless ($destination eq $home) {
    print "I'm not going home.\n";
}

There is no "elsunless" though. This is generally construed as a feature.

Iterative (Looping) Constructs

Perl has four main iterative statement types: while, until, for, and foreach. These statements allow a Perl program to repeatedly execute the same code for different values.

The while and until statements

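The while and until statements function similarly to the if and unless statements, in a looping fashion. First, the conditional part of the statement is checked. If the condition is met (if it is true for a while or false for an until) the block of the statement is executed. After the block finishes, the condition is checked again, and the block keeps being executed for as long as the condition continues to be met.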

while ($tickets_sold < 10000) {
    $available = 10000 - $tickets_sold;
    print "$available tickets are available.  How many would you like: ";
    $purchase = <STDIN>;
    chomp($purchase);
    $tickets_sold += $purchase;
}

Note that if the original condition is never met, the loop will never be entered at all. For example, if we've already sold 10,000 tickets, we might want to have the next line of the program say something like:

print "This show is sold out, please come back later.\n";
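The until form reads the same way with the sense of the test reversed. Here is a minimal sketch, assuming a made-up starting count and one ticket sold per pass instead of input from STDIN; the $iterations counter exists only to show how many times the block ran.

```perl
use strict;
use warnings;

my $tickets_sold = 9997;    # hypothetical starting point
my $iterations   = 0;

# The block runs as long as the condition is false.
until ($tickets_sold >= 10000) {
    $tickets_sold += 1;     # pretend one ticket is bought each time around
    $iterations++;
}

print "Sold the last $iterations tickets.\n";
print "This show is sold out, please come back later.\n";
```

Had $tickets_sold started at 10000 or more, the block would not have run at all.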

In our grade example earlier, line 4 reads:

while ($line = <GRADES>) {

This assigns the next line to the variable $line, and as we explained earlier, returns the value of $line so that the condition of the while statement can evaluate $line for truth. You might wonder whether Perl will get a false negative on blank lines and exit the loop prematurely. The answer is that it won't. The reason is clear, if you think about everything we've said. The line input operator leaves the newline on the end of the string, so a blank line has the value "\n". And you know that "\n" is not one of the canonical false values. So the condition is true, and the loop continues even on blank lines.

On the other hand, when we finally do reach the end of the file, the line input operator returns the undefined value, which always evaluates to false. And the loop terminates, just when we wanted it to. There's no need for an explicit test against the eof function in Perl, because the input operators are designed to work smoothly in a conditional context.
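You can watch both behaviors in a small sketch. To keep it self-contained, it reads from an in-memory filehandle (opened on a reference to a string, which needs Perl 5.8 or later) rather than from a real GRADES file; the sample text and the counter names are made up.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for a real file such as GRADES.
my $text = "first\n\nthird\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";

my $count = 0;
my $blank = 0;
while (my $line = <$fh>) {
    $count++;
    $blank++ if $line eq "\n";    # a blank line is "\n", which is true
}
close($fh);

# The loop ended only at end of file, not at the blank line.
print "Read $count lines, $blank of them blank.\n";
```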

In fact, almost everything is designed to work smoothly in a conditional context. For instance, an array in a scalar context returns its length. So you often see:

while (@ARGV) {
    process(shift @ARGV);
}

The loop automatically exits when @ARGV is exhausted.

The for statement

Another iterative statement is the for loop. A for loop runs exactly like the while loop, but looks a good deal different. (C programmers will find it very familiar though.)

for ($sold = 0; $sold < 10000; $sold += $purchase) {
    $available = 10000 - $sold;
    print "$available tickets are available.  How many would you like: ";
    $purchase = <STDIN>;
    chomp($purchase);
}

The for loop takes three expressions within the loop's parentheses: an expression to set the initial state of the loop variable, a condition to test the loop variable, and an expression to modify the state of the loop variable. When the loop starts, the initial state is set and the truth condition is checked. If the condition is true, the block is executed. When the block finishes, the modification expression is executed, the truth condition is again checked, and if true, the block is rerun with the new values. As long as the truth condition remains true, the block and the modification expression will continue to be executed.
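To make the correspondence concrete, the sketch below writes the same count twice, once as a for loop and once as a while loop with the three expressions spelled out separately. The variable names are arbitrary, and the my declarations are not part of this section's examples.

```perl
use strict;
use warnings;

my @for_seen;
for (my $i = 0; $i < 3; $i++) {
    push @for_seen, $i;
}

my @while_seen;
my $j = 0;                   # initial state
while ($j < 3) {             # truth condition
    push @while_seen, $j;
    $j++;                    # modification expression
}

print "for:   @for_seen\n";      # for:   0 1 2
print "while: @while_seen\n";    # while: 0 1 2
```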

The foreach statement

The last of Perl's main iterative statements is the foreach statement. foreach is used to execute the same code for each of a known set of scalars, such as an array:

foreach $user (@users) {
    if (-f "$home{$user}/.nexrc") {
        print "$user is cool... they use a perl-aware vi!\n";
    }
}

In a foreach statement, the expression in parentheses is evaluated to produce a list. Then each element of the list is aliased to the loop variable in turn, and the block of code is executed once for each element. Note that the loop variable becomes an alias for the element itself, rather than a copy of the element. Hence, modifying the loop variable will modify the corresponding element of the original array.
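A quick way to see the aliasing is to change the loop variable and then look at the array afterward. The @prices array here is invented for the sketch.

```perl
use strict;
use warnings;

my @prices = (10, 20, 30);    # made-up values

foreach my $price (@prices) {
    $price *= 2;              # changes the array element, not a copy
}

print "@prices\n";            # prints "20 40 60"
```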

You find many more foreach loops in the typical Perl program than for loops, because it's very easy in Perl to generate the lists that foreach wants to iterate over. A frequently seen idiom is a loop to iterate over the sorted keys of a hash:

foreach $key (sort keys %hash) {

In fact, line 9 of our grade example does precisely that.
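Filled out into a complete loop, the idiom might look like this sketch. The %grades hash and its contents are invented for the illustration and are not the data from the grade example.

```perl
use strict;
use warnings;

my %grades = (Noel => 25, Ben => 76, Clementine => 49);    # made-up data

my @report;
foreach my $key (sort keys %grades) {
    # keys returns the names in no particular order; sort puts them
    # in alphabetical order before the loop sees them.
    push @report, "$key: $grades{$key}";
}

foreach my $entry (@report) {
    print "$entry\n";
}
```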

Breaking out: next and last

The next and last operators allow you to modify the flow of your loop. It is not at all uncommon to have a special case; you may want to skip it, or you may want to quit when you encounter it. For example, if you are dealing with UNIX accounts, you may want to skip the system accounts (like root or lp). The next operator would allow you to skip to the end of your current loop iteration, and start the next iteration. The last operator would allow you to skip to the end of your block, as if your test condition had returned false. This might be useful if, for example, you are looking for a specific account and want to quit as soon as you find it.

foreach $user (@users) {
    if ($user eq "root" or $user eq "lp") {
        next;
    }
    if ($user eq "special") {
        print "Found the special account.\n";
        # do some processing
        last;
    }
}

It's possible to break out of multi-level loops by labeling your loops and specifying which loop you want to break out of. Together with statement modifiers (another form of conditional we haven't talked about), this can make for very readable loop exits, if you happen to think English is readable:

LINE: while ($line = <ARTICLE>) {
    last LINE if $line eq "\n"; # stop on first blank line
    next LINE if $line =~ /^#/; # skip comment lines
    # your ad here
}

You may be saying, "Wait a minute, what's that funny ^# thing there inside the leaning toothpicks? That doesn't look much like English." And you're right. That's a pattern match containing a regular expression (albeit a rather simple one). And that's what the next section is about. Perl is above all a text processing language, and regular expressions are at the heart of Perl's text processing.

