You can specify literal hexadecimal floating-point numbers in v5.22, just as you can in C99, Java, Ruby, and other languages. Perl, which uses doubles to store floating-point numbers, can represent a limited set of values. Up to now, you’ve had to specify those floating-point numbers in decimal, hoping that a double could exactly represent that number. That hope, sometimes unfounded, is the basis for the common newbie question about floating-point errors.
Before you get into the details, here’s the new feature. You can write a hexadecimal floating point number with hex digits. You specify the power-of-two exponent with p, which must be present:

    my $number = 0x1.aaaaaaaaaaaabp+1;
This allows you to represent a floating point number exactly because this representation maps almost directly onto the actual storage. If you can type it as a literal, you represent that number exactly to the limit of its precision. Exact numbers have no rounding errors.
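To make the notation concrete, here is a small sketch. It assumes perl v5.22 or later, and the particular values are only illustrations. The hex digits give the mantissa, where each digit after the point is worth a successive power of 1/16, and the decimal number after p is the power of two that scales it.

```perl
use v5.22;

# mantissa 0x1.8 is 1 + 8/16 = 1.5; p1 multiplies by 2**1
my $three  = 0x1.8p1;

# mantissa 0x0.2 is 2/16 = 0.125; p0 multiplies by 2**0
my $eighth = 0x0.2p0;

# the hexadecimal point is optional, but the p exponent is not
my $big    = 0x1fp3;     # 31 * 2**3

# a negative exponent divides by a power of two
my $small  = 0x1p-3;     # 1 / 2**3

say "0x1.8p1 is $three";    # 3
say "0x0.2p0 is $eighth";   # 0.125
say "0x1fp3  is $big";      # 248
say "0x1p-3  is $small";    # 0.125
```

Every one of these values is a small sum of powers of two, so each should be stored without any rounding.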
Don’t get too excited though. Now you can write exact values, but a double can still represent only the same set of values that it could before. The decimal numbers that you couldn’t represent exactly are still inexact. It’s the other side of the coin. More on that in a moment.
You print numbers as hexadecimal floating point numbers in the same form with the new %a specifier for sprintf:

    printf '%a', $number; # 0x1.aaaaaaaaaaaabp+1 again
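Since %a prints the stored value and a literal reads it back in, the two should round-trip without loss. Here is a minimal sketch of that check, assuming perl v5.22 or later. The three values are arbitrary illustrations, and the string eval is only there because the hex form ends up in a string that the program built itself.

```perl
use v5.22;

# Illustrative values; 0.1 and 1/3 are not exactly representable, 2.5 is
my @numbers = ( 0.1, 1/3, 2.5 );

foreach my $number ( @numbers ) {
    my $hex   = sprintf '%a', $number;  # hex spelling of the stored value
    my $again = eval $hex;              # string eval, since $hex is a string
    printf "%-20s %-24s %s\n", $number, $hex,
        $again == $number ? 'round trips' : 'changed';
}
```

On a perl built with ordinary IEEE 754 doubles, the hex column for 0.1 should show a long run of 9s, which is the repeating binary fraction cut off at the limit of the mantissa.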
I write more about the actual storage in Using Inline::C to look at a number’s double representation.
Some snags
But, as with many new features, there’s a catch. Although you can specify a hexadecimal floating point literal, oct and hex don’t understand them (and probably never will):

    my $number = hex( '0x1f.0p3' );
    print "number is $number";
The number conversion gets as far as the decimal point, which it doesn’t think belongs there (as documented):
number is 31
The Scalar::Util module is similarly afflicted (not so clearly documented):
    use Scalar::Util qw(looks_like_number);
    print 'Does not look like a number' unless looks_like_number("0x1f.0p3");

    Does not look like a number
This happens because there are two places where perl parses numbers. There’s toke.c, which is the lexer that decides how perl parses the source. That handles the literals. Then there’s numeric.c’s grok_number that decides how to turn strings into numbers. That code doesn’t support hexadecimal floats or some other notations. perl only claims to handle hexadecimal floating point literals.
But perl doesn’t claim to handle floating-point numbers with oct or hex, both of which clearly state that they only handle non-negative integers. Both accept a leading 0x (hex treats the prefix as optional, and oct uses it to choose the base), but neither reads past the integer digits. The looks_like_number case is tougher since it handles exponential notation but not hexadecimal notation. It would be nice if everything that interpreted numbers supported the same formats. This has pinched me a few times, but that’s my fault because perl does act as documented.
To convert a string in hexadecimal floating point notation to a number, you can use a string eval:

    my $number = eval( '0x1f.0p3' );
    print "number is $number";
Now you get the right number:
number is 248
Or maybe you’d feel better using Safe:
    use v5.22;   # hex float literals, and say
    use Safe;

    # why these ops? No idea, but they all have to be there
    my @ops = qw( lineseq leaveeval const padsv padany rv2gv );

    my $compartment = Safe->new;
    $compartment->permit_only( @ops );

    my $result = $compartment->reval( '0x1fp3' );
    say "result is $result";
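If you’d rather not evaluate the string as code at all, another option is to pick the string apart yourself. This is only a sketch: hex_float_to_number is a hypothetical helper, not something perl or any module provides, and it assumes the mantissa has few enough digits to fit in a double. It doesn’t try to round overlong mantissas or handle infinities and NaNs.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a hex float string such as '0x1f.0p3'
# without string eval. Returns undef if the string doesn't match.
sub hex_float_to_number {
    my( $string ) = @_;

    return unless $string =~ m/
        \A ([+-]?) 0x                        # optional sign, then the prefix
        ([0-9a-f]*) (?: \. ([0-9a-f]*) )?    # integer and fraction digits
        p ([+-]?[0-9]+)                      # required power-of-two exponent
        \z/xi;
    my( $sign, $int, $frac, $exp ) = ( $1, $2, $3 // '', $4 );
    return unless length( $int . $frac );    # need at least one hex digit

    # integer digits: each one shifts the value up by a factor of 16
    my $value = 0;
    $value = $value * 16 + hex( $_ ) for split //, $int;

    # fraction digits: each one is worth a successive power of 1/16
    my $scale = 1 / 16;
    foreach my $digit ( split //, $frac ) {
        $value += hex( $digit ) * $scale;
        $scale /= 16;
    }

    $value *= 2 ** $exp;
    return $sign eq '-' ? -$value : $value;
}

print "number is ", hex_float_to_number( '0x1f.0p3' ), "\n";   # 248
```

The same pattern, without the arithmetic, could serve as a stand-in check where looks_like_number says no.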
But back to the reason you’d do this.
Rounding
You can’t represent all real numbers as a double; there’s a granularity that some of the numbers fall between. When you write a fractional number in decimal, perl stores it as a sum of powers of two, and most decimal fractions don’t have a finite binary expansion. If you only look at a few decimal places, it doesn’t matter. This code looks like it works fine:
    my $e = 0.1;
    my $n = 0;
    foreach ( 1 .. 10 ) {
        $n += $e;
        printf "%f\n", $n;
    }

The output looks like it correctly adds 0.1 and ends up with exactly 1:

    0.100000
    0.200000
    0.300000
    0.400000
    0.500000
    0.600000
    0.700000
    0.800000
    0.900000
    1.000000
You can show more decimal places:
    my $e = 0.1;
    my $n = 0;
    foreach ( 1 .. 10 ) {
        $n += $e;
        printf "%.17f\n", $n;
    }
You see a bit of fuzziness at the end, although not enough to cause problems unless you’re using that many digits:
    0.10000000000000001
    0.20000000000000001
    0.30000000000000004
    0.40000000000000002
    0.50000000000000000
    0.59999999999999998
    0.69999999999999996
    0.79999999999999993
    0.89999999999999991
    0.99999999999999989
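The %a format can make that fuzziness easier to spot than extra decimal places do, because it shows the stored bits rather than a decimal approximation of them. A small sketch, assuming perl v5.22 or later built with ordinary doubles:

```perl
use v5.22;

# Add 0.1 ten times, as in the loop above
my $sum = 0;
$sum += 0.1 for 1 .. 10;

printf "0.1 is stored as %a\n", 0.1;
printf "sum is stored as %a\n", $sum;
printf "1   is stored as %a\n", 1;

say $sum == 1 ? 'sum equals 1' : 'sum does not equal 1';
```

The hex forms of the sum and of 1 should differ in the last place, which is why the comparison fails even though the default output of both is 1.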
Do this addition enough times and it might matter:
    my $e = 1/10;
    my $n = 0;
    foreach ( 1 .. 100_000_000 ) {
        $n += $e;
        printf "%.17f\n", $n unless $_ % 10_000_000;
    }
That seemingly insignificant error moves up the decimal places:
    999999.99983897537458688
    2000000.00071374792605639
    3000000.00164507050067186
    4000000.00257639307528734
    4999999.99975590128451586
    5999999.99603061098605394
    6999999.99230532068759203
    7999999.98858003038913012
    8999999.98485474102199078
    9999999.98112945072352886
If $e is a number that can be represented exactly as a sum of powers of two in a finite number of bits, you don’t see this problem:

    my $e = 1/8; # or 0.125
    my $n = 0;
    foreach ( 1 .. 100_000_000 ) {
        $n += $e;
        printf "%.17f\n", $n unless $_ % 10_000_000;
    }
Since 1/8 is a power of two, a double stores it exactly, and there’s no round-off error even after millions of additions:

    1250000.00000000000000000
    2500000.00000000000000000
    3750000.00000000000000000
    5000000.00000000000000000
    6250000.00000000000000000
    7500000.00000000000000000
    8750000.00000000000000000
    10000000.00000000000000000
    11250000.00000000000000000
    12500000.00000000000000000
Instead of a division of a decimal number or a decimal literal, v5.22 now allows you to write that out as 0x0.2p0:

    my $e = 0x0.2p0;
    my $n = 0;
    foreach ( 1 .. 100_000_000 ) {
        $n += $e;
        printf "%.17f\n", $n unless $_ % 10_000_000;
    }
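As a quick sanity check, you can confirm that the different spellings of one eighth all name the same double, and compare an exact step against an inexact one. This is a sketch that assumes perl v5.22 or later. repeated_sum is a hypothetical helper, and the count is much smaller than in the loops above so that it runs quickly.

```perl
use v5.22;

# Four spellings of one eighth: all should be the identical double
my @eighths = ( 1/8, 0.125, 0x0.2p0, 0x1p-3 );
say 'all spellings are equal'
    unless grep { $_ != $eighths[0] } @eighths;

# Hypothetical helper: add $step to zero $count times
sub repeated_sum {
    my( $step, $count ) = @_;
    my $n = 0;
    $n += $step for 1 .. $count;
    return $n;
}

my $count = 1_000_000;
printf "%.17f\n", repeated_sum( 0x0.2p0, $count );   # exact step
printf "%.17f\n", repeated_sum( 0.1,     $count );   # inexact step
```

The first total should come out as a whole number with nothing after the decimal point, while the second should already show drift in its later decimal places.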
This isn’t something new since you could represent that number as a decimal, but in some cases it may be more convenient to specify it in hex. If you have an example of that convenience, let me know!
Things to remember
- v5.22 supports hexadecimal floating point literals (but not strings). The literal starts with 0x and ends with pEXP.
- Hexadecimal floating-point literals specify exact numbers that won’t have round-off errors.
Why “%a”?
If I’m not mistaken, ‘a’ is the only letter in the phrase “hexadecimal floating point” that wasn’t already taken for a format conversion character in the C standard or GLIBC. (‘m’ is a GLIBC extension.)