Perl’s flip-flop operator, .. (otherwise known as the range operator in scalar context), is a simple way to choose a window on some data. It returns false until its lefthand side is true. Once the lefthand side is true, the flip-flop operator returns true until its righthand side is true, including the evaluation where the righthand side first becomes true. After that, the flip-flop operator returns false again. That is, the lefthand side turns it on and the righthand side turns it off.
Start with a simple file that has START and END markers:
    # input.txt
    Ignore this
    Ignore this too
    START
    Show this
    And this
    Also this
    END
    Don't show this
    Or this
You need to extract the lines between those two markers:
    # flip-flop
    while( <> ) {
        # print rather than say: each line still carries its own newline
        print if /START/ .. /END/;
    }
The output shows just the stuff between those markers, along with the marker lines themselves:
    % perl flip-flop input.txt
    START
    Show this
    And this
    Also this
    END
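To watch the operator switch on and off, you can record what it returns for each line. This is only a sketch: trace_flip_flop is a made-up helper name, and the lines are fed from a list instead of a file. According to perlop, the operator returns the empty string for false and a sequence number for true, with E0 appended to the last number in the window:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the flip-flop's value for each line.
sub trace_flip_flop {
    my @lines = @_;
    my @trace;
    for (@lines) {
        # Assigning to a scalar puts .. in scalar context, so it is a flip-flop
        my $state = /START/ .. /END/;
        push @trace, $state;
    }
    return @trace;
}

my @values = trace_flip_flop( 'Ignore this', 'START', 'Show this', 'END', 'Or this' );
print join( ',', map { "[$_]" } @values ), "\n";   # [],[1],[2],[3E0],[]
```

The END line is still inside the window (3E0 is a true value), which is why the marker shows up in the output.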
What if you make the file a bit more complicated so there is an extra matching window? Once the flip-flop operator goes back to false, it can turn to true once its lefthand side matches again. Here’s a file with two windows:
    # input2.txt
    Ignore this
    Ignore this too
    START
    Show this
    And this
    Also this
    END
    Don't show this
    Or this
    START
    Show this again
    And this again
    Also this again
    END
    But ignore this
Now you get both windows of output:
    % perl flip-flop input2.txt
    START
    Show this
    And this
    Also this
    END
    START
    Show this again
    And this again
    Also this again
    END
That’s fine, but it gets a bit more complicated when you use the same flip-flop more than once without knowing its state. Modify the flip-flop program so it goes through each file separately instead of combining all the files into the ARGV filehandle:
    foreach my $file ( @ARGV ) {
        open my $fh, '<', $file or die "Could not open $file: $!\n";
        while( <$fh> ) {
            # print rather than say: each line still carries its own newline
            print if /START/ .. /END/;
        }
    }
To watch it work (or not work, coming later), split the input2.txt file into two separate files, each of which has its own window you want to extract:
    # input2a.txt
    Ignore this
    Ignore this too
    START
    Show this
    And this
    Also this
    END
    Don't show this

    # input2b.txt
    Or this
    START
    Show this again
    And this again
    Also this again
    END
    But ignore this
The output isn’t surprising and it looks the same as it did with the previous program:
    % perl flip-flop input2a.txt input2b.txt
    START
    Show this
    And this
    Also this
    END
    START
    Show this again
    And this again
    Also this again
    END
However, it’s at this point that some people get confused. The flip-flop operator doesn’t care about which file you are looking at, what happened in the last file, and so on. To see it “break”, change input2a.txt so it doesn’t have the END marker:
    # input2a.txt
    Ignore this
    Ignore this too
    START
    Show this
    And this
    Also this
    Don't show this
Since input2a.txt doesn’t complete the window as you intended, the flip-flop, maintaining its state, is still true when it starts the second file:
    % perl flip-flop input2a.txt input2b.txt
    START
    Show this
    And this
    Also this
    Don't show this
    # input2b.txt
    Or this
    START
    Show this again
    And this again
    Also this again
    END
The flip-flop maintains its state globally. It doesn’t care about starting new loops, new iterations, or anything else. You might think that you could confine that to a subroutine, but it’s not even safe there. Every flip-flop operator that perl compiles has its own state, and perl compiles a subroutine only once:
    foreach my $file ( @ARGV ) {
        open my $fh, '<', $file or die "Could not find $file\n";
        extract( $fh );
    }

    sub extract {
        my( $fh ) = shift;
        while( <$fh> ) {
            print if /START/ .. /END/; # this is the same .. on every call
        }
    }
The output doesn’t change! The flip-flop doesn’t care that it’s in a subroutine. Every call runs the same compiled .. operator, so it’s the same flip-flop as before.
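If you want to reproduce the leak without creating files, here is a minimal sketch that reads from in-memory filehandles. The extract_all helper and the shortened sample text are invented for the illustration; the first "file" deliberately has no END marker:

```perl
use strict;
use warnings;

# Returns the lines that the flip-flop lets through for one filehandle.
sub extract {
    my( $fh ) = @_;
    my @shown;
    while( my $line = <$fh> ) {
        # the same compiled .. runs on every call, so its state carries over
        push @shown, $line if $line =~ /START/ .. $line =~ /END/;
    }
    return @shown;
}

# Hypothetical driver: treats each string as a separate in-memory file.
sub extract_all {
    my @texts = @_;
    my @shown;
    for my $text ( @texts ) {
        open my $fh, '<', \$text or die "Could not open in-memory file: $!";
        push @shown, extract( $fh );
        close $fh;
    }
    return @shown;
}

print extract_all(
    "Ignore this\nSTART\nShow this\n",                        # no END marker
    "Or this\nSTART\nShow this again\nEND\nBut ignore this\n",
);
```

The "Or this" line from the second string comes out even though it sits before that string’s START, because the flip-flop was still on when the first string ran out.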
So, if every flip-flop operator that perl compiles has its own state, and you want a flip-flop operator with a new state, you just need to compile a new flip-flop for each iteration. That’s simple enough, kinda. This program won’t work because the subroutine reference is the same each time. When perl compiles it, it knows that the anonymous subroutine is going to be the same each time so perl reuses it:
    foreach my $file ( @ARGV ) {
        open my $fh, '<', $file or die "Could not find $file\n";
        make_extractor()->($fh);
    }

    sub make_extractor {
        sub { # only compiled once
            my( $fh ) = shift;
            while( <$fh> ) {
                print if /START/ .. /END/;
            }
        };
    }
You can verify this by dumping the return value of make_extractor:
    # dump-subs.pl
    use Devel::Peek;

    my @subs = map { make_extractor() } 1 .. 3;

    # Dump writes to standard error itself, so there is nothing to print
    Dump( $_ ) foreach @subs;

    sub make_extractor {
        sub { # only compiled once
            my( $fh ) = shift;
            while( <$fh> ) {
                print if /START/ .. /END/;
            }
        };
    }
You get the same subroutine each time, which means you get the same flip-flop each time:
    % perl dump-subs.pl
    SV = RV(0x80f66c) at 0x80f660
      REFCNT = 2
      FLAGS = (ROK)
      RV = 0x81a4f0
      SV = PVCV(0x80e4b8) at ...
    SV = RV(0x80f6fc) at 0x80f6f0
      REFCNT = 2
      FLAGS = (ROK)
      RV = 0x81a4f0
      SV = PVCV(0x80e4b8) ...
    SV = RV(0x8030bc) at 0x8030b0
      REFCNT = 2
      FLAGS = (ROK)
      RV = 0x81a4f0
      SV = PVCV(0x80e4b8) ...
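A lighter check than reading Devel::Peek output is to compare reference addresses with refaddr from Scalar::Util. This is a rough sketch: make_plain, make_closure, and count_distinct are names invented here, and the count for the plain case depends on perl reusing the subroutine the way the dump above shows:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# An anonymous sub that captures nothing
sub make_plain { return sub { return 'plain' } }

# An anonymous sub that captures the lexical $id, so it is a closure
sub make_closure {
    my $id = shift;
    return sub { return "closure $id" };
}

# Hypothetical helper: how many different subroutines does $maker hand out?
sub count_distinct {
    my( $maker, $n ) = @_;
    my @subs = map { $maker->( $_ ) } 1 .. $n;   # keep them all alive at once
    my %seen;
    $seen{ refaddr( $_ ) }++ for @subs;
    return scalar keys %seen;
}

printf "plain:   %d distinct of 3\n", count_distinct( \&make_plain,   3 );
printf "closure: %d distinct of 3\n", count_distinct( \&make_closure, 3 );
```

If the plain line reports a single distinct address while the closure line reports three, you are seeing the same reuse that the dump shows.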
You have to make each subroutine different somehow. The trick is to use a closure, which is a subroutine that references a lexical variable that has gone out of scope. In this case, you can enlist state to keep track of how many flip-flop operators you make. Since each new anonymous subroutine has to capture $count, perl can’t reuse the previous definition. You force it to make a new subroutine, and each new subroutine comes with its own flip-flop:
    # flip-flop
    use 5.010;

    foreach my $file ( @ARGV ) {
        open my $fh, '<', $file or die "Could not find $file\n";
        make_extractor()->($fh);
    }

    sub make_extractor {
        state $count = 0;
        $count++;
        sub {
            my( $fh ) = shift;
            while( <$fh> ) {
                print "$count: $_" if /START/ .. /END/;
            }
        };
    }
Now each file gets its own flip-flop. You can see where the first file ends (missing its END marker) and the second file begins:
    % perl flip-flop input2a.txt input2b.txt
    1: START
    1: Show this
    1: And this
    1: Also this
    1: Don't show this
    2: START
    2: Show this again
    2: And this again
    2: Also this again
    2: END
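The same closure trick can be packaged so the caller keeps control of the loop. This is a sketch under a couple of assumptions: make_window is an invented name, and the markers are passed in as qr// patterns. Because the anonymous subroutine captures $start and $end, each call to make_window should produce a fresh subroutine with its own flip-flop:

```perl
use strict;
use warnings;

# Hypothetical factory: each call returns a tester with independent state.
sub make_window {
    my( $start, $end ) = @_;
    return sub {
        my( $line ) = @_;
        # scalar() matters: in list context .. would be the list range operator
        return scalar( $line =~ $start .. $line =~ $end );
    };
}

my $first  = make_window( qr/START/, qr/END/ );
my $second = make_window( qr/START/, qr/END/ );

$first->( 'START' );   # opens the first window only

print 'first:  ', ( $first->( 'Show this' )  ? 'on' : 'off' ), "\n";   # first:  on
print 'second: ', ( $second->( 'Show this' ) ? 'on' : 'off' ), "\n";   # second: off
```

You would create one tester per file and call it on each line, so a missing END marker in one file can’t leak into the next.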
For more information about flip-flops, see perlop’s entry for Range Operators.
Things to remember
- Every flip-flop maintains a global state
- Flip-flops are not scoped
- Create a new flip-flop by wrapping it in a closure
The glob operator (spelled variously glob("*.c") or <*.c>) is another place where perl stores “hidden” global state in the operator when it’s compiled. At http://stackoverflow.com/questions/2633447/why-doesnt-perl-file-glob-work-outside-of-a-loop-in-scalar-context/2634012#2634012 I used a similar trick of creating closures to encapsulate the state of a glob operator, so that it can be used outside of a single while loop. This is slightly useless (since you can just glob to a list instead) but at the same time it’s pretty cool.
Hi Hobbs, I visited the link you provided (in fact I voted up your comment!). There you have given two examples, the second one uses a closure, to refer to a lexical variable, but the first one just returns an anonymous subroutine, so that should have the same problem as the flip-flop operator here does with anonymous subroutines.
Would you please explain how “Flip-flops are not scoped”?
Does that mean you can’t do:
In short, it means each range you create maintains its global state and side effects. These don’t disappear when the scope you are in finishes, like a my variable would, for instance.