You can use several different Perl modules to inspect data structures. Many of these modules, however, are really two tools in one. Besides showing a data structure as a string, they also serialize the data as Perl code so you can reconstruct the data structure. That second job often makes things hard for you. If you don’t need the serialization job, don’t use a module that insists on it.
The Data::Dumper module is popular because it comes with Perl. Here’s a program that we’ll use for the rest of the Item, save for changes to the module dumping the structure:
    use Data::Dumper qw(Dumper);
    use DateTime;
    use HTTP::Request;

    my $request = HTTP::Request->new(
        GET => 'http://www.perl.org',
    );
    $request->header( 'X-Perl' => '5.12.2' );
    $request->header( 'Cat'    => 'Buster' );

    my $data = {
        hash => {
            cat  => 'Buster',
            dog  => 'Addy',
            bird => 'Poppy',
        },
        array    => [ qw( a b c ) ],
        datetime => DateTime->now,
        request  => $request,
    };

    print Dumper( $data );
The output is a Perl data structure, suitable for eval. That makes it a bit verbose and ugly:
    $VAR1 = {
      'array' => [
        'a',
        'b',
        'c'
      ],
      'hash' => {
        'cat' => 'Buster',
        'dog' => 'Addy',
        'bird' => 'Poppy'
      },
      'request' => bless( {
        '_content' => '',
        '_uri' => bless( do{\(my $o = 'http://www.perl.org')}, 'URI::http' ),
        '_headers' => bless( {
          'cat' => 'Buster',
          'x-perl' => '5.12.2'
        }, 'HTTP::Headers' ),
        '_method' => 'GET'
      }, 'HTTP::Request' ),
      'datetime' => bless( {
        'local_rd_secs' => 68540,
        'local_rd_days' => 734452,
        'rd_nanosecs' => 0,
        'locale' => bless( {
          'default_time_format_length' => 'medium',
          'native_territory' => 'United States',
          'native_language' => 'English',
          'native_complete_name' => 'English United States',
          'en_language' => 'English',
          'id' => 'en_US',
          'default_date_format_length' => 'medium',
          'en_complete_name' => 'English United States',
          'en_territory' => 'United States'
        }, 'DateTime::Locale::en_US' ),
        'local_c' => {
          'hour' => 19,
          'second' => 20,
          'month' => 11,
          'quarter' => 4,
          'day_of_year' => 315,
          'day_of_quarter' => 42,
          'minute' => 2,
          'day' => 11,
          'day_of_week' => 5,
          'year' => 2011
        },
        'utc_rd_secs' => 68540,
        'formatter' => undef,
        'tz' => bless( {
          'name' => 'UTC'
        }, 'DateTime::TimeZone::UTC' ),
        'utc_year' => 2012,
        'utc_rd_days' => 734452,
        'offset_modifier' => 0
      }, 'DateTime' )
    };
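To make the serialization half of the job concrete, here is a minimal sketch of the round trip that the Data::Dumper output is designed for. It uses a smaller structure than the one in this Item, and the variable names $source and $copy are only illustrative. Only eval strings that you produced yourself and trust.

```perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

# A smaller structure than the one in the Item (illustrative only)
my $data = {
    hash  => { cat => 'Buster', dog => 'Addy' },
    array => [ qw( a b c ) ],
};

# Serialize: Dumper returns Perl source that assigns to $VAR1
my $source = Dumper( $data );

# Deserialize: declare $VAR1 so the eval'd assignment has a target
# under strict. Never eval a string from an untrusted source.
my $VAR1;
eval $source;
die "Could not rebuild the structure: $@" if $@;
my $copy = $VAR1;

# $copy is a new structure with the same contents as $data
print "cat is $copy->{hash}{cat}\n";
print "array has ", scalar @{ $copy->{array} }, " elements\n";
```

If you never plan to run that eval step, the Perl-source formatting buys you nothing, which is the point of this Item.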
The Data::Dump module also serializes, but its output is cleaner than Data::Dumper’s. In void context, its pp function prints the result for you (to standard error):
    use Data::Dump qw(pp);

    ...; # same $data thing as before

    pp( $data );
The output looks a lot like the Data::Dumper output because it has to be a Perl data structure:
    {
      array => ["a", "b", "c"],
      datetime => bless({
        formatter => undef,
        local_c => {
          day => 11,
          day_of_quarter => 42,
          day_of_week => 5,
          day_of_year => 315,
          hour => 19,
          minute => 18,
          month => 11,
          quarter => 4,
          second => 33,
          year => 2011,
        },
        local_rd_days => 734452,
        local_rd_secs => 69513,
        locale => bless({
          default_date_format_length => "medium",
          default_time_format_length => "medium",
          en_complete_name => "English United States",
          en_language => "English",
          en_territory => "United States",
          id => "en_US",
          native_complete_name => "English United States",
          native_language => "English",
          native_territory => "United States",
        }, "DateTime::Locale::en_US"),
        offset_modifier => 0,
        rd_nanosecs => 0,
        tz => bless({ name => "UTC" }, "DateTime::TimeZone::UTC"),
        utc_rd_days => 734452,
        utc_rd_secs => 69513,
        utc_year => 2012,
      }, "DateTime"),
      hash => { bird => "Poppy", cat => "Buster", dog => "Addy" },
      request => bless({
        _content => "",
        _headers => bless({ "cat" => "Buster", "x-perl" => "5.12.2" }, "HTTP::Headers"),
        _method => "GET",
        _uri => bless(do{\(my $o = "http://www.perl.org")}, "URI::http"),
      }, "HTTP::Request"),
    }
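The print-in-void-context behavior can be imitated with Perl’s built-in wantarray. The following is a rough sketch of the technique only, not the code of any dumper module; the show function is a hypothetical helper built on the core Data::Dumper module.

```perl
use strict;
use warnings;
use Data::Dumper ();

# Hypothetical helper: returns the dump when the caller wants a value,
# and prints it to STDERR when called in void context.
sub show {
    my ($data) = @_;

    local $Data::Dumper::Terse    = 1;  # drop the "$VAR1 = " prefix
    local $Data::Dumper::Sortkeys = 1;  # stable key order
    my $string = Data::Dumper::Dumper( $data );

    # wantarray is undef in void context, defined otherwise
    return $string if defined wantarray;

    print STDERR $string;
    return;
}

my $data = { cat => 'Buster' };

my $text = show( $data );  # scalar context: returns the string, prints nothing
show( $data );             # void context: prints to STDERR
```

The design choice here is convenience: a call on a line by itself is almost always a debugging statement, so printing saves you from typing print every time.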
Steven Haryanto added a filter feature to an existing interface, which you can see in “Use Data::Dump filters for nicer pretty-printing”. Filters give you better control over the parts you want to distill, such as that DateTime:
    {
      array => # some items hidden
      [2011-02-03, "d", "...", "n"],
      datetime => 2011-02-03,
      hash => { bird => "Poppy", cat => "Buster", dog => "Addy" },
    }
If you forget about the serialization, though, you can do much better. Often, you want to inspect a data structure to see what’s on the inside without saving it for future use. If that’s the case, you don’t need to see the data structure as Perl code, and the pretty printer can organize the data much better and provide more information. The Data::Printer module doesn’t care at all about serialization. In void context, its p function automatically prints:
    use Data::Printer;

    ...; # same $data thing as before

    p( $data );
The output is just as verbose, but it’s also much denser. When it prints an object, it shows you the methods in the class:
    \ {
        array   [
            [0] "a",
            [1] "b",
            [2] "c"
        ],
        datetime   DateTime  {
            public methods (134) : add, add_duration, am_or_pm, bootstrap, ce_year, christian_era, clone, compare, compare_ignore_floating, date, datetime, day, day_abbr, day_name, day_of_month, day_of_month_0, day_of_quarter, day_of_quarter_0, day_of_week, day_of_week_0, day_of_year, day_of_year_0, day_0, DefaultLanguage, DefaultLocale, delta_days, delta_md, delta_ms, dmy, doq, doq_0, dow, dow_0, doy, doy_0, duration_class, epoch, era, era_abbr, era_name, format_cldr, formatter, fractional_second, from_day_of_year, from_epoch, from_object, hires_epoch, hms, hour, hour_1, hour_12, hour_12_0, INFINITY, is_dst, is_finite, is_infinite, is_leap_year, iso8601, jd, language, last_day_of_month, leap_seconds, local_day_of_week, local_rd_as_seconds, local_rd_values, locale, MAX_NANOSECONDS, mday, mday_0, mdy, microsecond, millisecond, min, minute, mjd, mon, mon_0, month, month_abbr, month_name, month_0, NAN, nanosecond, NEG_INFINITY, new, now, offset, quarter, quarter_abbr, quarter_name, quarter_0, sec, second, SECONDS_PER_DAY, secular_era, set, set_day, set_formatter, set_hour, set_locale, set_minute, set_month, set_nanosecond, set_second, set_time_zone, set_year, STORABLE_freeze, STORABLE_thaw, strftime, subtract, subtract_datetime, subtract_datetime_absolute, subtract_duration, time, time_zone, time_zone_long_name, time_zone_short_name, today, truncate, utc_rd_as_seconds, utc_rd_values, utc_year, wday, wday_0, week, week_number, week_of_month, week_year, weekday_of_month, year, year_with_christian_era, year_with_era, year_with_secular_era, ymd
            private methods (38) : _accumulated_leap_seconds, _add_overload, _adjust_for_positive_difference, _calc_local_components, _calc_local_rd, _calc_utc_components, _calc_utc_rd, _cldr_pattern, _compare, _compare_overload, _day_has_leap_second, _day_length, _era_index, _format_nanosecs, _handle_offset_modifier, _is_leap_year, _month_length, _new, _new_from_self, _normalize_leap_seconds, _normalize_nanoseconds, _normalize_seconds, _normalize_tai_seconds, _offset_for_local_datetime, _rd2ymd, _seconds_as_components, _space_padded_string, _string_compare_overload, _string_equals_overload, _string_not_equals_overload, _stringify, _subtract_overload, _time_as_seconds, _utc_hms, _utc_ymd, _weeks_in_year, _ymd2rd, _zero_padded_number
            internals: {
                formatter         undef,
                local_c           {
                    day              11,
                    day_of_quarter   42,
                    day_of_week      5,
                    day_of_year      315,
                    hour             19,
                    minute           41,
                    month            11,
                    quarter          4,
                    second           42,
                    year             2011
                },
                local_rd_days     734452,
                local_rd_secs     70902,
                locale            DateTime::Locale::en_US,
                offset_modifier   0,
                rd_nanosecs       0,
                tz                DateTime::TimeZone::UTC,
                utc_rd_days       734452,
                utc_rd_secs       70902,
                utc_year          2012
            }
        },
        hash   {
            bird   "Poppy",
            cat    "Buster",
            dog    "Addy"
        },
        request   HTTP::Request  {
            Parents       HTTP::Message
            Linear @ISA   HTTP::Request, HTTP::Message
            public methods (10) : accept_decodable, as_string, clone, dump, method, new, parse, uri, uri_canonical, url
            private methods (0)
            internals: {
                _content   "",
                _headers   HTTP::Headers,
                _method    "GET",
                _uri       URI::http
            }
        }
    }
You probably don’t want to see all that internal gunk from DateTime or HTTP::Request, so you can set filters for them to print them however you like:
    use Data::Printer {
        filters => {
            'DateTime'      => sub { "DateTime => $_[0]" },
            'HTTP::Request' => sub { "URL => " . $_[0]->uri },
        },
    };

    ...; # same $data thing as before

    p( $data );
Now you can see what you need much more easily:
    \ {
        array   [
            [0] "a",
            [1] "b",
            [2] "c"
        ],
        datetime   DateTime => 2011-11-11T19:51:06,
        hash   {
            bird   "Poppy",
            cat    "Buster",
            dog    "Addy"
        },
        request   URL => http://www.perl.org
    }
So far, you’ve set the options in the import list, but you can also override them each time you dump something:
    p( $data, { index => 0 } );
Now you don’t have array indices:
    \ {
        array   [
            "a",
            "b",
            "c"
        ],
        datetime   DateTime => 2011-11-11T20:25:22,
        hash   {
            bird   "Poppy",
            cat    "Buster",
            dog    "Addy"
        },
        request   URL => http://www.perl.org
    }
You can make colorized output too by setting another property:
    use Data::Printer {
        colored => 1,
        filters => {
            'DateTime'      => sub { "DateTime => $_[0]" },
            'HTTP::Request' => sub { "URL => " . $_[0]->uri },
        },
    };

    ...; # same $data thing as before

    p( $data );
And you can change the colors if you don’t like the default set. You have to choose a color name that Term::ANSIColor recognizes:
    use Data::Printer {
        colored => 1,
        color   => {
            array  => 'yellow',
            string => 'cyan',
            hash   => 'green',
        },
        filters => {
            'DateTime'      => sub { "DateTime => $_[0]" },
            'HTTP::Request' => sub { "URL => " . $_[0]->uri },
        },
    };

    ...; # same $data thing as before

    p( $data );
This color scheme might be more pleasing to you.
Lastly, one of the most annoying “features” of a pretty printer is the constant reference passing. Perl flattens a subroutine’s arguments into a single list, so to keep each data structure’s identity, you have to pass it as a reference:
    use Data::Dumper;
    print Dumper( \%hash );

    use Data::Dump;
    pp( \%hash );
You can do that with Data::Printer too:
    use Data::Printer;
    p( \%hash );
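To see why the reference matters, here is a small sketch of argument flattening that needs no dumper module at all. The count_args function is a hypothetical helper that only reports how many arguments it received.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived in @_
sub count_args { return scalar @_ }

my %hash  = ( cat => 'Buster', dog => 'Addy' );
my @array = qw( a b c );

# Passed directly, the hash and the array dissolve into one flat list:
# four items from the hash (two keys, two values) plus three from the array
my $flattened = count_args( %hash, @array );

# Passed as references, each structure arrives as a single argument
my $referenced = count_args( \%hash, \@array );

print "flattened: $flattened, referenced: $referenced\n";
```

In the flattened call, the subroutine has no way to tell where the hash ends and the array begins, so a dumper could not show them as separate structures.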
Data::Printer uses prototypes to make that easier for you. The Dumper and pp functions each dump a list of structures, but Data::Printer’s p dumps exactly one structure. As such, it can use prototypes to recognize a whole hash or array as the first argument:
    use Data::Printer;
    p( %hash );
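The prototype technique itself is plain Perl. As a rough illustration, and not Data::Printer’s actual source, the hypothetical kind_of function below uses a reference prototype so that Perl passes a reference to whatever variable you name, without flattening it.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Printer's code. The \[$@%] prototype
# tells Perl to accept one scalar, array, or hash variable and to pass
# a reference to it instead of flattening it. The sub must be declared
# before the calls so that the prototype is in effect at compile time.
sub kind_of (\[$@%]) {
    my ($ref) = @_;
    return ref $ref;
}

my %hash   = ( cat => 'Buster' );
my @array  = qw( a b c );
my $scalar = 'Addy';

# No backslashes at the call site; the prototype supplies the reference
my @kinds = ( kind_of(%hash), kind_of(@array), kind_of($scalar) );

print "@kinds\n";   # HASH ARRAY SCALAR
```

The trade-off is that such a prototype fixes the first argument to exactly one variable, which fits a function that dumps exactly one structure.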
This feature has a few oddities, but Breno G. de Oliveira, the module’s author, explains them in the documentation.
Things to remember
- Serialization and inspection are different tasks
- Most Perl pretty printers try to serialize
- The Data::Printer