There are two major phases in the execution of a program run by Perl, which you sometimes see called “compile time” and “run time”, or sometimes now, “compile phase” and “run phase”. In the broadest of strokes, perl compiles code in the compile phase, and when it’s completely done with that, it moves on to the run phase, where it executes the code that it compiled.
That’s really too simple a model, though, even if it is a good starting point. perl can also run code in the compile phase and compile code in the run phase. Along with that, perl has various interstitial phases where you can do more work. Each of these phases and sub-phases has a special subroutine and you can define each of them as many times as you like. Each also executes in a particular order relative to other definitions of the same subroutine. These are documented in perlmod. People sometimes call these scheduled blocks:
Subroutine | Description | Execution order |
---|---|---|
BEGIN | Compiles its code and runs it immediately, before perl compiles the rest of the file | FIFO |
UNITCHECK | Runs right after perl finishes compiling each compilation unit, which is usually a file | LIFO |
CHECK | Runs right after the main compile phase and before the main run phase | LIFO |
INIT | Runs right before the main run phase starts | FIFO |
END | Compiles with the rest of the code and runs at program end | LIFO |
Some of these execute in the order that you define them, or first in, first out (FIFO), while others execute in reverse order of definition, or last in, first out (LIFO). In general, the subroutines that happen right away or at the start of a phase, such as BEGIN and INIT, are FIFO. The ones that happen at the end of a phase, such as CHECK, UNITCHECK, and END, are LIFO.
Looking at it from start to finish, the subroutines run in this order: BEGIN, UNITCHECK, CHECK, INIT, then the main run phase, and finally END.
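To watch both orderings at once, here is a minimal sketch that defines each block twice, deliberately out of order, and records when each one fires. It assumes Perl 5.10 or later for UNITCHECK, and the labels are only illustrative:

```perl
use strict;
use warnings;

# Declare the log without assigning to it: an assignment here would run
# during the run phase and wipe out what the earlier blocks recorded.
our @order;

INIT      { push @order, 'INIT 1' }
CHECK     { push @order, 'CHECK 1' }
UNITCHECK { push @order, 'UNITCHECK 1' }
BEGIN     { push @order, 'BEGIN 1' }
END       { print "END 1\n" }

BEGIN     { push @order, 'BEGIN 2' }
UNITCHECK { push @order, 'UNITCHECK 2' }
CHECK     { push @order, 'CHECK 2' }
INIT      { push @order, 'INIT 2' }
END       { print "END 2\n" }

# By the time the main run phase starts, everything but END has fired.
print "$_\n" for @order, 'main run phase';
```

Run as an ordinary script, this should print BEGIN 1, BEGIN 2, UNITCHECK 2, UNITCHECK 1, CHECK 2, CHECK 1, INIT 1, INIT 2, main run phase, END 2, and END 1, one per line. The position of a block in the file only matters relative to other blocks of the same kind.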
Initial designs
So far, perl can’t save the state of its compilation, so you can’t merely compile a Perl program and run it later, perhaps on another machine, like you can do with Java, Python, or Ruby. It’s a glaring omission, but also a technical limitation based on the other features you get from Perl.
The CHECK subroutine gave Perl’s compiler toolkit a place to stop and perhaps examine or save its state after it had compiled everything in its main phase. However, this still wouldn’t include modules that you require (and so load and compile at run time) or code that you load with a string eval. As such, in general, you can never pre-compile every Perl program, which is almost the same as saying it’s a broken feature.
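A quick way to see perl compiling during the run phase is to put a BEGIN block inside a string eval. This is only a sketch, and it assumes Perl 5.14 or later because it reads the current phase from the ${^GLOBAL_PHASE} variable:

```perl
use 5.014;    # ${^GLOBAL_PHASE} appeared in Perl 5.14
use strict;
use warnings;

our @seen;

# This block fires while perl is still compiling the main program.
BEGIN { push @seen, 'file BEGIN during ' . ${^GLOBAL_PHASE} }

# perl does not compile the string until this statement executes, so the
# BEGIN block inside it fires in the middle of the run phase.
eval q{
    BEGIN { push @seen, 'eval BEGIN during ' . ${^GLOBAL_PHASE} }
    1;
} or die $@;

say for @seen;
```

The first entry reports START, which is the name ${^GLOBAL_PHASE} uses for the main compile phase, while the second reports RUN. Anything that wants to capture a complete compilation at CHECK time has already missed that second BEGIN block.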
The UNITCHECK subroutine was designed to do the same thing, but to save the compilation of each file. There’s a little-known feature of Perl, probably deservedly so: the .pmc file, which is a pre-compiled module file. perl will look for the .pmc version before it tries to load the .pm version. That means that perl could save itself the work of compiling that module, which might be substantial. This feature hasn’t worked out so well, just yet.
Inadvertent execution
Since some of these subroutines run at compile time, you might inadvertently run code when you don’t think anything will happen, such as under the -c switch for a syntax check. This program is innocent and doesn’t run any code:
    $ perl -c -e 'print qq|Hello compiler!\n|'
    -e syntax OK
These one-liners, however, have the subroutines that execute during the compile phase, so each compiles its code and runs it even though it never enters the main run phase:
    $ perl -c -e 'BEGIN { print qq|Hello compiler!\n| }'
    Hello compiler!
    -e syntax OK
    $ perl -c -e 'CHECK { print qq|Hello compiler!\n| }'
    Hello compiler!
    -e syntax OK
    $ perl -c -e 'UNITCHECK { print qq|Hello compiler!\n| }'
    Hello compiler!
    -e syntax OK
This means that someone might be able to provide you a module or library that, maliciously or otherwise, runs code that you might not want to run.
This is not true for the INIT or END blocks, since they belong to the main run phase, which perl never enters under -c.
    $ perl -c -e 'END { print qq|Hello compiler!\n| }'
    -e syntax OK
    $ perl -c -e 'INIT { print qq|Hello compiler!\n| }'
    -e syntax OK
This also matters for files that you load. Create a small module that has each of these subroutines:
    package Phases;
    use 5.012;
    BEGIN     { say 'Ran BEGIN' }
    UNITCHECK { say 'Ran UNITCHECK' }
    CHECK     { say 'Ran CHECK' }
    INIT      { say 'Ran INIT' }
    END       { say 'Ran END' }
    1;
By merely loading this module, you run some code. This simple module is in the current working directory:
    $ perl5.14.1 -c -E 'use Phases;'
    Ran BEGIN
    Ran UNITCHECK
    Ran CHECK
    -e syntax OK
This also happens if you load the module with the -M switch:
    $ perl5.14.1 -c -MPhases -E '1'
    Ran BEGIN
    Ran UNITCHECK
    Ran CHECK
    -e syntax OK
Here you explicitly used the -c switch, but you might also have this problem with your editor if it runs syntax checks for you. This means that someone might be able to trick you into running code that you never know about. This hasn’t been a huge problem for Perlers, but it is possible.
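Part of the reason that merely loading a module runs code is that use is itself a compile-time construct: perl treats it as a BEGIN block that requires the module and then calls its import method. Here is a small sketch of that. The module name Local::Noisy is hypothetical and defined inline so the example stands alone, and ${^GLOBAL_PHASE} again assumes Perl 5.14 or later:

```perl
use 5.014;    # for ${^GLOBAL_PHASE}
use strict;
use warnings;

our @log;

# A stand-in for a module on disk (hypothetical name), defined inline.
BEGIN {
    package Local::Noisy;
    $INC{'Local/Noisy.pm'} = __FILE__;    # mark it as already loaded
    sub import { push @main::log, 'import ran during ' . ${^GLOBAL_PHASE} }
}

# Roughly the same as: BEGIN { require Local::Noisy; Local::Noisy->import }
use Local::Noisy;

say for @log;
```

When you run this as a script, the import method reports the START phase, which means it has already fired before the first run-phase statement executes. A syntax check with -c would trigger it too.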
Black magic
If you want to change these subroutines dynamically, you can use the Devel::Hook module to insert them through code rather than having to type them out literally. Since these scheduled blocks are really just arrays of code references, you might want to change around those arrays. Here’s the example from the module’s documentation:
    use Devel::Hook ();
    INIT {
        print "INIT #2\n";
    }
    BEGIN {
        Devel::Hook->push_INIT_hook( sub { print "INIT #3 (hook)\n" } );
        Devel::Hook->unshift_INIT_hook( sub { print "INIT #1 (hook)\n" } );
    }
    print "RUNTIME\n";
It provides methods to add or remove some of the code references, whether from the front or back of the array. So far, it doesn’t supply a way to splice those internal arrays.
Things to remember
- BEGIN blocks compile and run immediately, in FIFO order.
- UNITCHECK blocks run at the end of the file’s compilation, in LIFO order.
- CHECK blocks run at the end of the main compilation, in LIFO order.
- INIT blocks run at the beginning of the run phase, in FIFO order.
- END blocks run at the end of the run phase, in LIFO order.
It’d be great to have a tutorial on how these phases are actually *useful* – hint hint :D
If I find any uses for them, I’ll let you know. :)
I put in UNITCHECK, and it turns out to be less useful than expected. When perl loads a module, it’s not just compiling the Perl code, it’s also running it, so the module isn’t done until the final 1; sings. I tried to make use of UNITCHECK to automatically namespace-clean and immutablise Moose classes, but you need to do this later than UNITCHECK. I sometimes think Perl needs a UNITEND for this, one that both the current unit and any child units can insert code blocks into.
I was thinking that would be useful for namespace cleansing, but maybe it’s not. Did you try that in a regular CHECK block?
And what about require/eval?
I got stuck with Perl attributes. It looks like attributes don’t work if you _require_ a module that uses them.
I have no answer for that because I’m not going to subject myself to the pain of attributes (or Catalyst). If you find the answer let me know!