
Chapter 18. Compiling

Contents:

The Life Cycle of a Perl Program
Compiling Your Code
Executing Your Code
Compiler Backends
Code Generators
Code Development Tools
Avant-Garde Compiler, Retro Interpreter

If you came here looking for a Perl compiler, you may be surprised to discover that you already have one--your perl program (typically /usr/bin/perl) already contains a Perl compiler. That might not be what you were thinking, and if it wasn't, you may be pleased to know that we do also provide code generators (which some well-meaning folks call "compilers"), and we'll discuss those toward the end of this chapter. But first we want to talk about what we think of as The Compiler. Inevitably there's going to be a certain amount of low-level detail in this chapter that some people will be interested in, and some people will not. If you find that you're not, think of it as an opportunity to practice your speed-reading skills.

Imagine that you're a conductor who's ordered the score for a large orchestral work. When the box of music arrives, you find several dozen booklets, one for each member of the orchestra with just their part in it. But curiously, your master copy with all the parts is missing. Even more curiously, the parts you do have are written out using plain English instead of musical notation. Before you can put together a program for performance, or even give the music to your orchestra to play, you'll first have to translate the prose descriptions into the normal system of notes and bars. Then you'll need to compile the individual parts into one giant score so that you can get an idea of the overall program.

Similarly, when you hand the source code of your Perl script over to perl to execute, it is no more useful to the computer than the English description of the symphony was to the musicians. Before your program can run, Perl needs to compile[1] these English-looking directions into a special symbolic representation. Your program still isn't running, though, because the compiler only compiles. Like the conductor's score, even after your program has been converted to an instruction format suitable for interpretation, it still needs an active agent to interpret those instructions.

[1] Or translate, or transform, or transfigure, or transmute, or transmogrify.
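To see that compilation really does finish before anything runs, here is a minimal sketch. It uses a string eval as a stand-in for a whole script, on the assumption that the two behave alike in this respect; the names @log, $ok, and $broken are ours, chosen purely for illustration.

```perl
use strict;
use warnings;

my @log;    # hypothetical record of statements that actually ran

# The whole string is compiled before any of it is run, so the syntax
# error on its second line keeps even the first line from executing.
my $ok = eval q{
    push @log, "first statement ran";
    my $broken = );
    1;
};

print "eval succeeded: ", ($ok ? "yes" : "no"), "\n";
print "statements run: ", scalar(@log), "\n";
print "compiler said: $@" if $@;
```

Nothing is pushed onto @log, because the compiler rejects the unit as a whole before the interpreter ever sees it. For a script in a file, running perl with the -c switch does the compile step alone and then stops.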

18.1. The Life Cycle of a Perl Program

You can break up the life cycle of a Perl program into four distinct phases, each with separate stages of its own. The first and the last are the most interesting ones, and the middle two are optional. The stages are depicted in Figure 18-1.

Figure 18-1. The life cycle of a Perl program

  1. The Compilation Phase

    During phase 1, the compile phase, the Perl compiler converts your program into a data structure called a parse tree. Along with the standard parsing techniques, Perl employs a much more powerful one: it uses BEGIN blocks to guide further compilation. BEGIN blocks are handed off to the interpreter to be run as soon as they are parsed, which effectively runs them in FIFO order (first in, first out). This includes any use and no declarations; these are really just BEGIN blocks in disguise. Any CHECK, INIT, and END blocks are scheduled by the compiler for delayed execution.

    Lexical declarations are noted, but assignments to them are not executed. All eval BLOCKs, s///e constructs, and noninterpolated regular expressions are compiled here, and constant expressions are pre-evaluated. The compiler is now done, unless it gets called back into service later. At the end of this phase, the interpreter is again called up to execute any scheduled CHECK blocks in LIFO order (last in, first out). The presence or absence of a CHECK block determines whether we next go to phase 2 or skip over to phase 4.

  2. The Code Generation Phase (optional)

    CHECK blocks are installed by code generators, so this optional phase occurs when you explicitly use one of the code generators (described later in "Code Generators"). These convert the compiled (but not yet run) program into either C source code or serialized Perl bytecodes--a sequence of values expressing internal Perl instructions. If you choose to generate C source code, that source can eventually be compiled into a file called an executable image in native machine language.[2]

    [2] Your original script is an executable file too, but it's not machine language, so we don't call it an image. An image file is called that because it's a verbatim copy of the machine codes your CPU knows how to execute directly.

    At this point, your program goes into suspended animation. If you made an executable image, you can go directly to phase 4; otherwise, you need to reconstitute the freeze-dried bytecodes in phase 3.

  3. The Parse Tree Reconstruction Phase (optional)

    To reanimate the program, its parse tree must be reconstructed. This phase exists only if code generation occurred and you chose to generate bytecode. Perl must first reconstitute its parse trees from that bytecode sequence before the program can run. Perl does not run directly from the bytecodes; that would be slow.

  4. The Execution Phase

    Finally, what you've all been waiting for: running your program. Hence, this is also called the run phase. The interpreter takes the parse tree (which it got either directly from the compiler or indirectly from code generation and subsequent parse tree reconstruction) and executes it. (Or, if you generated an executable image file, it can be run as a standalone program since it contains an embedded Perl interpreter.)

    At the start of this phase, before your main program gets to run, all scheduled INIT blocks are executed in FIFO order. Then your main program is run. The interpreter can call back into the compiler as needed upon encountering an eval STRING, a do FILE or require statement, an s///ee construct, or a pattern match with an interpolated variable that is found to contain a legal code assertion.

    When your main program finishes, any delayed END blocks are finally executed, this time in LIFO order. The very first one seen will execute last, and then you're done. (END blocks are skipped only if you exec or your process is blown away by an uncaught catastrophic error. Ordinary exceptions are not considered catastrophic.)
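The ordering rules in the phases above can be watched in action with a short script. This is a sketch under stated assumptions, not anything shipped with perl: the names @order and $label are ours, and the script must be run directly as the main program, because CHECK and INIT blocks are not run for code that is first compiled at run time (inside a string eval, for instance).

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @order;    # hypothetical log of which block ran when

# Defined first, so under LIFO ordering this END block runs last of all
# and can print the complete log.
END { print "$_\n" for @order }

my $label = "assigned at run time";

# BEGIN blocks run as soon as they are parsed. The declaration of $label
# has been noted by then, but its assignment has not been executed.
BEGIN { push @order, "BEGIN 1 (label " . (defined $label ? "set" : "unset") . ")" }
BEGIN { push @order, "BEGIN 2" }

# CHECK blocks run at the end of the compile phase, last in, first out.
CHECK { push @order, "CHECK 1" }
CHECK { push @order, "CHECK 2" }

# INIT blocks run at the start of the run phase, first in, first out.
INIT  { push @order, "INIT 1" }
INIT  { push @order, "INIT 2" }

# END blocks run after the main program, last in, first out.
END   { push @order, "END 1" }
END   { push @order, "END 2" }

# The main program: by now the assignment to $label has happened.
push @order, "main (label " . (defined $label ? "set" : "unset") . ")";

# Prints:
#   BEGIN 1 (label unset)
#   BEGIN 2
#   CHECK 2
#   CHECK 1
#   INIT 1
#   INIT 2
#   main (label set)
#   END 2
#   END 1
```

Notice that the position of a block in the file decides only its order relative to other blocks of the same kind. Which phase it runs in is determined entirely by its name.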

Now we'll discuss these phases in greater detail, and in a different order.




Copyright © 2001 O'Reilly & Associates. All rights reserved.
