book – Page 10 – The Effective Perler

Use lookarounds to eliminate special cases in split

The split built-in takes a string and turns it into a list, discarding the separators that you specify as a pattern. This is easy when the separator is simple, but seems hard when the separator gets trickier.
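As a minimal sketch of the idea (the sample strings are made up for illustration and are not from the original Item), a lookahead lets split cut at a position without throwing any characters away, and a negative lookahead lets it skip separators that should stay in the data:

```perl
use strict;
use warnings;

# Split before each uppercase letter. The lookahead matches a position,
# not a character, so nothing is discarded.
my @words = split /(?=[A-Z])/, 'useLookaroundsInSplit';
print join( '|', @words ), "\n";     # use|Lookarounds|In|Split

# Split on a comma only when no whitespace follows it, so the
# "Doe, Jane" field survives intact.
my @fields = split /,(?!\s)/, 'Doe, Jane,42,Springfield';
print join( '|', @fields ), "\n";    # Doe, Jane|42|Springfield
```

Without the lookarounds, both cases would need a post-processing step to glue pieces back together.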

Enchant closures for better debugging output

When you’re using code references heavily, you’re going to have a problem figuring out which one of them is having a problem. You define them in possibly several and far-flung parts of your program, but when it comes to using them, you don’t know which one you are using. You can’t really print its value like you would for a scalar, making it more difficult for you to debug things. You can dereference a scalar or an array to see what it is, but you can’t dereference a code reference without making it do something.
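One way to make a code reference describe itself, sketched here under assumptions, is to bless it into a class that overloads stringification. The class name Local::EnchantedSub and its enchant and describe methods are hypothetical names chosen for this example. It needs Perl 5.14 or later for the package block syntax.

```perl
use strict;
use warnings;

package Local::EnchantedSub {    # hypothetical class name
    use Scalar::Util qw(refaddr);

    # Stringify through describe() instead of showing CODE(0x...)
    use overload '""' => 'describe', fallback => 1;

    my %name_of;    # labels, keyed by the address of each code reference

    sub enchant {
        my ( $class, $name, $code ) = @_;
        my $self = bless $code, $class;
        $name_of{ refaddr($self) } = $name;
        return $self;
    }

    sub describe {
        my ($self) = @_;
        return 'sub <' . $name_of{ refaddr($self) } . '>';
    }
}

my $add_two = Local::EnchantedSub->enchant( add_two => sub { $_[0] + 2 } );

print "About to call $add_two\n";    # About to call sub <add_two>
print $add_two->(5), "\n";           # 7
```

The blessed reference is still an ordinary code reference, so calling it works exactly as before. Only its string form changes.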

Normalize your Perl source

Perl has had Unicode support since Perl 5.6, which means that most Perl tutorials have been bending the truth a bit when they tell you that a Perl identifier, the name that you give to variables, starts with [A-Za-z_] and continues with [0-9A-Za-z_]. With Unicode support, you have many more characters available to you, but moving outside the ASCII range has some problems. You can’t always tell what a variable name is just by looking at it (and this is a design bug in Perl: RT 96814). For instance, you don’t really know what a variable is when two names that look the same on screen are built from different code points.
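The underlying ambiguity can be sketched with plain strings rather than identifiers. This example assumes the core Unicode::Normalize module. The word used is arbitrary.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

my $composed   = "caf\x{E9}";      # e-acute as one code point, U+00E9
my $decomposed = "cafe\x{301}";    # e followed by U+0301 COMBINING ACUTE ACCENT

# Both render the same way on screen, but they are different strings
print length($composed), ' vs ', length($decomposed), "\n";        # 4 vs 5
print $composed eq $decomposed ? "same\n" : "different\n";         # different

# Normalizing to one form (NFC here) makes them compare equal
print NFC($decomposed) eq $composed ? "same\n" : "different\n";    # same
```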

Intercept warnings with a __WARN__ handler

Perl defines two internal pseudo-signals that you can trap. There’s one for die, which I covered in Override die with END or CORE::GLOBAL::die and eventually told you not to use. There’s also one for warn that’s quite safe to use when you need to intercept warnings.
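A minimal sketch of such a handler follows. The @caught array and the messages are illustrative. Localizing the handler keeps the interception limited to one block.

```perl
use strict;
use warnings;

my @caught;

{
    # Collect warnings instead of letting them go to STDERR
    local $SIG{__WARN__} = sub {
        my ($message) = @_;
        push @caught, $message;
    };

    warn "disk is nearly full\n";    # an explicit warning

    my $count;
    my $total = $count + 1;          # "uninitialized value" warning from perl itself
}

print 'Caught ', scalar(@caught), " warnings\n";    # Caught 2 warnings
print $caught[0];                                   # disk is nearly full
```

Once the block ends, the previous handler (or the default behavior) is back in place.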

Know the difference between utf8 and UTF-8

Perl actually has two encodings that get the letters u, t, f, and 8. One will happily let you do bad things, and the other will let you do bad things but with a warning that you can make fatal.
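A quick way to see that these really are two encodings, sketched with the core Encode module, is to ask for the canonical name behind each spelling:

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# The hyphenated name resolves to the strict encoding
print find_encoding('UTF-8')->name, "\n";    # utf-8-strict

# The unhyphenated name resolves to Perl's lax variant
print find_encoding('utf8')->name, "\n";     # utf8
```

Encode treats the names case-insensitively, so it is the hyphen, not the capitalization, that selects the strict encoding.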

Know the difference between character strings and UTF-8 strings

Normally, you shouldn’t have to care about a string’s encoding. Indeed, the abstract string has no encoding. It exists as an idea without a representation and it’s not until you want to put it on disk, send it down a pipe, or otherwise force it to exist as electrical pulses, magnetic pole orientation, and so on that you need to think of it in concrete terms. All stored data, even ASCII, has an encoding. Until you force it to have a bit pattern to live in the tangible world, you shouldn’t have to worry about anything like an encoding.
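The distinction shows up as soon as you measure the same text both ways. This sketch assumes the core Encode module and uses one arbitrary character:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $characters = "\x{263A}";    # one character, WHITE SMILING FACE

# Encoding gives the string a concrete representation as octets
my $octets = encode( 'UTF-8', $characters );

print length($characters), "\n";    # 1 (characters)
print length($octets),     "\n";    # 3 (octets: E2 98 BA)

# Decoding turns the octets back into the abstract character string
my $round_trip = decode( 'UTF-8', $octets );
print $round_trip eq $characters ? "round trip ok\n" : "mismatch\n";    # round trip ok
```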

Use a Task distribution to specify groups of modules

A Task distribution is like a normal Perl distribution in structure, but it doesn’t actually provide any code. It lists as prerequisites all of the modules or distributions that you want to install so you can use a conventional CPAN tool to install all of the dependencies. A Task is slightly different from the older way, a Bundle, but for most people and uses, a Task might be a better way.
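As a rough sketch, the Makefile.PL of such a distribution could be little more than a prerequisite list. The distribution name Task::Local::Example, the version numbers, and the chosen modules are placeholders, and this assumes ExtUtils::MakeMaker as the build tool.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;

# No real code ships with this distribution; the prerequisites are the point
WriteMakefile(
    NAME      => 'Task::Local::Example',    # hypothetical distribution name
    VERSION   => '0.01',
    PREREQ_PM => {
        'XML::Twig'  => 0,                  # any version
        'Test::More' => '0.98',             # at least this version
    },
);
```

Installing the Task with a CPAN client then pulls in everything listed in PREREQ_PM.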

Group tests by their task with Test::More’s subtest()

In the earlier Item, Understand the Test Anything Protocol (TAP), you saw the very basics of that simple, line-oriented test report. You ran a single test and it output a single line to denote the status of the test, and possibly some diagnostic information. The TAP, however, didn’t organize any of the tests for you.
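A small sketch of grouping with subtest() follows. The subtest names and the assertions inside them are invented for illustration.

```perl
use strict;
use warnings;
use Test::More;

# Each subtest reports as a single test in the enclosing plan
subtest 'addition' => sub {
    plan tests => 2;
    is( 1 + 1, 2, 'one plus one' );
    is( 2 + 3, 5, 'two plus three' );
};

subtest 'string repetition' => sub {
    is( 'ab' x 2, 'abab', 'x repeats a string' );
    done_testing();    # a subtest can also leave its plan open
};

done_testing();
```

The inner results appear indented in the TAP output, and the outer level sees only two tests, one per subtest.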

Turn off autovivification when you don’t want it

Autovivification, although a great feature, might bite you when you don’t expect it. I explained this feature in Understand autovivification, but I didn’t tell you that there’s a way to control it and even turn it off completely.
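The following sketch contrasts the default behavior with the autovivification pragma. It assumes that module is installed from CPAN, since it is not part of core Perl. The hash names are illustrative.

```perl
use strict;
use warnings;

my %noisy;
my $missing = $noisy{outer}{inner};    # merely reading creates $noisy{outer}
print exists $noisy{outer} ? "vivified\n" : "untouched\n";    # vivified

my %quiet;
{
    no autovivification;               # lexically scoped; CPAN module assumed
    my $still_missing = $quiet{outer}{inner};
}
print exists $quiet{outer} ? "vivified\n" : "untouched\n";    # untouched
```

Because the pragma is lexical, it affects only the enclosing block and leaves the rest of the program alone.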

Modify XML data with XML::Twig

If you need to deal with XML, first, we’re very sorry. Maybe you did something wrong in a previous life, such as munging XML with regular expressions. If you do better in this life, perhaps you won’t have to deal with XML in the next one. Doing better might mean using XML::Twig, a powerful package for walking an XML tree, each part of which is a twig. For the rest of this Item, I’ll just call the module Twig.
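As a minimal sketch of modifying data with Twig, a handler can rewrite each matching element as the parser reaches it. The sample document and the choice to uppercase titles are made up for this example, and XML::Twig must be installed from CPAN.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = '<library><book><title>effective perl</title></book></library>';

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once for every completed <title> element
        title => sub {
            my ( $t, $title ) = @_;
            $title->set_text( uc $title->text );
        },
    },
);

$twig->parse($xml);

# Serialize the modified tree back to a string
my $output = $twig->sprint;
print "$output\n";
```

The handler receives the twig object and the element, so it can change text, attributes, or structure without any regular expressions touching the markup.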