[UPDATE: this is not a problem in v5.18 and later.]
In Item 33: “Watch out for match variables”, you found out that the match variables $`, $&, and $' come with a performance hit. With all of the module code that you might use, you might be using those variables even though you didn’t code with them yourself.
There’s a module that can tell you if anything in your program used one of these nasty variables. The Devel::SawAmpersand module uses the B backend to look for any compiled Perl code that deals with the internal sawampersand variable that, when true, automatically slows down all of your matches and substitutions. By inserting a couple of lines at the top of your code, you can tell if Perl ran into sawampersand:
    use Devel::SawAmpersand qw(sawampersand);

    END {
        print "saw ampersand => ", sawampersand(), "\n"
    }

    my $string = 'cat bird dog';

    $string =~ m/\s*bird\s*/;

    print <<"HERE";
    \$` => $`
    \$& => $&
    \$' => $'
    HERE
The output shows you that it did indeed see at least one of those variables:
    $` => cat
    $& => bird
    $' => dog
    saw ampersand => 1
That’s not very useful because it doesn’t tell you where it found the variables, and you had to change the source to find out. You really want to do this without changing the source and also have it tell you in which files and on which line numbers those variables appear. That’s what Devel::FindAmpersand does:
    my $string = 'cat bird dog';

    $string =~ m/\s*bird\s*/;

    print <<"HERE";
    \$` => $`
    \$& => $&
    \$' => $'
    HERE
Now I run the program by loading the Devel::FindAmpersand module on the command line (in this case, reporting the line number where the string starts):
    $ perl5.10.1 -MDevel::FindAmpersand amp.pl
    $` => cat
    $& => bird
    $' => dog
    Found evil variable $` in file amp.pl, line 5
    Found evil variable $& in file amp.pl, line 5
    Found evil variable $' in file amp.pl, line 5
That’s fine for finding the variables in the same file, but what if they are in a different module? Devel::FindAmpersand only reports what it finds in the main script. Here’s a script that pulls in a library that uses one of the match variables:
    require 'uses-amp.pl';

    my $string = 'cat bird dog';

    $string =~ m/\s*bird\s*/;

    print "Pre-match is $PREMATCH\n";
Here’s the tiny culprit library:
    # this is a naughty module

    my $matched = $&;

    1;
Devel::FindAmpersand doesn’t complain though, since the $& doesn’t appear in the main file:
    $ perl -MDevel::FindAmpersand amp.pl
    Pre-match is cat
Since you’re really only doing this during development and probably infrequently, you can rig a solution to search through any loaded files. Create a small module that uses an END block to go through all of the files listed in %INC and examine each individually:
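    # ScanAmpersand.pm
    END {
        # %INC maps each loaded file to its path; check every path
        # in a fresh perl with Devel::FindAmpersand loaded
        foreach my $file ( values %INC ) {
            system $^X, '-MDevel::FindAmpersand', $file;
        }
    }

    1;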
When you use your ScanAmpersand, you get warnings from any files that used one of the variables:
    $ perl -MScanAmpersand amp.pl
    Pre-match is cat
    Found evil variable $& in file uses-amp.pl, line 3
Workarounds
Now that you’ve found the naughty uses of $`, $&, and $', you need to fix up the code to remove them. If you are using Perl 5.10 or later, use the per-match variables ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} with the /p flag instead (Item 33: Watch out for match variables). If you are using a version earlier than Perl 5.10, you might modify the regular expression to explicitly capture parts of it. The Devel::SawAmpersand documentation gives you these possibilities:
Naughty | Nice |
$` of /pattern/ | $1 of /(.*?)pattern/s |
$& of /pattern/ | $1 of /(pattern)/ |
$' of /pattern/ | $+ of /pattern(.*)/s |
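To make the two workarounds concrete, here is a minimal sketch that applies them to the same 'cat bird dog' string used earlier. The variable names ($pre, $match, $post) are only illustrative, and the list-context assignments are one assumed way of grabbing the capture group rather than the only one.

```perl
use strict;
use warnings;
use 5.010;    # only the /p half of this sketch needs Perl 5.10

my $string = 'cat bird dog';

# Explicit captures, following the table: these work on older perls too.
# In list context each match returns its single capture group.
my ($pre)   = $string =~ /(.*?)bird/s;    # stands in for $`
my ($match) = $string =~ /(bird)/;        # stands in for $&
my ($post)  = $string =~ /bird(.*)/s;     # stands in for $'

print "pre   => [$pre]\n";
print "match => [$match]\n";
print "post  => [$post]\n";

# Perl 5.10 and later: the /p flag sets the per-match variables
# for this one match only.
if ( $string =~ /bird/p ) {
    print "pre   => [${^PREMATCH}]\n";
    print "match => [${^MATCH}]\n";
    print "post  => [${^POSTMATCH}]\n";
}
```

Both halves report [cat ], [bird], and [ dog], and neither one mentions $`, $&, or $'.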
Things to remember
- The match variables $`, $&, and $' suffer performance hits
- You don’t have to use them directly to suffer
- Use Devel::FindAmpersand to track down their use
I’ve been meaning to do this for a while. Your post has finally spurred me into action… the next version of NYTProf will report the place that it first noticed that the slow match variables had been seen. (Typically that’ll be the file that uses one of them.)
And Tim has made it so: NYTProf 4.04 — Came, Saw Ampersand, and Conquered.