This document attempts to detail the interpolation behaviour of Perl in double-quoted strings. It was originally written and tested using Perl 5.005_03, but I've just re-tested with 5.6.0.
There are a couple of error message differences (although I've left the old 5.005_03 errors because the differences are negligible) and one fairly alarming bug that has been introduced in 5.6.0 (clearly detailed in Ex 21).
It was born from a discussion on comp.lang.perl.misc. It was written by Jason King, a reasonably unexciting contributor to c.l.p.misc. If you have any questions after reading this document then don't mail me; I've only included my email address so that people can report errors in this document or, in the unlikely event that someone wants to say thanks, do so (even in that case - keep it concise *8^).
Perhaps Randal L. Schwartz said it best when he said simply "Perl doesn't interpolate expressions, just variables". We'll see as we go along how accurate this is. It's certainly a good place to start, so we'll call it Theorem One (or T1 for short).
Let's look at the original example that started all this, posted by Alan Curry in c.l.p.misc:
#!/usr/bin/perl -wl
use strict;
my @a=(1000, 10, 20, 30, 40, 50, 60);
my $i=2;
my $f=sub { return 4 };
my $r=\@a;
print "$i";
print "$a[0]";
print "@a";
print "$i+1";
print "$a[$i+1]";
print "$a[$i+1]+1";
print "$f->()+1";
print "$r->[1]";
print "$a[$f->()+1]+1";
print "@a[2..$f->()]";
__END__
I suggest that you copy this code and run it, then have a look at the output before proceeding and see if you get what you expected. If not, have a think about why not before going further.
We're going to go through each line and explain the interpolation that's happening. The first few will be boring and predictable - they're done for completeness and because I'm not making any assumptions about the Perl experience of the reader (not at this point anyway *8^).
print "$i"; # Output: 2
Simple interpolation, a variable is seen and its value is substituted. This conforms to T1.
print "$a[0]"; # Output: 1000
Slightly more complicated than the above, but $a[0] is still
a variable, so interpolation is done and the value substituted. For those
that think of the [] as an index operator, Perl doesn't really see it
this way when it comes to strings. Perl treats $a[0] as a
single variable and so just interpolates it. This therefore still conforms
to T1, because no expression evaluation was done.
print "@a"; # Output: 1000 10 20 30 40 50 60
This is a simple one also, @a is clearly a variable, it's an
array. So it's interpolated and the results substituted.
T1 is still happy.
print "$i+1"; # Output: 2+1
From T1 we see that the variable interpolation is done,
but the expression is not evaluated. So the $i is
substituted, but the resulting string '2+1' is not evaluated any further.
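To make T1 concrete, here is a minimal sketch that puts the same characters inside and outside the quotes. The names $inside and $outside are mine and are not part of Alan's example.

```perl
use strict;
use warnings;

my $i = 2;

# Inside the quotes only the variable is replaced; '+1' stays literal text.
my $inside = "$i+1";

# Outside the quotes the same characters form an expression and are evaluated.
my $outside = $i + 1;

print "$inside\n";     # 2+1
print "$outside\n";    # 3
```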
print "$a[$i+1]"; # Output: 30
Here we see T1 break down. It said that expression
evaluation is not done, but in this example clearly the
2+1 expression is evaluated. It might be clear to some that
this is only done in the process of interpolating the variable, but to my
mind T1 needs a little amendment, so let's create
T2.
T2: The variables in a string are interpolated, but no operators are evaluated UNLESS the interpolation of a variable demands a value from that operator.
So does T2 cover the example at hand ? Perl sees the
$a[something] variable, but to index the @a
array it needs a value for 'something', so it evaluates $i+1.
And extremely importantly: this evaluation is not done in
string context, it is done as if that code was a normal line of
code. Then with that value it gets the array element and
substitutes that. T2 seems to work.
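A small sketch of T2, assuming the same @a and $i as the original program: whatever sits between the brackets is run as ordinary code, so other expressions work there too. The extra subscripts ($i*2 and -1) and the helper variable $index are my own additions for illustration.

```perl
use strict;
use warnings;

my @a = (1000, 10, 20, 30, 40, 50, 60);
my $i = 2;

# The subscript is ordinary Perl code, so any expression works there.
my $plus  = "$a[$i+1]";     # index 3
my $times = "$a[$i*2]";     # index 4
my $last  = "$a[-1]";       # a negative index counts from the end

# Equivalent: compute the index first, then interpolate a plain element.
my $index = $i + 1;
my $same  = "$a[$index]";

print "$plus $times $last $same\n";   # 30 40 60 30
```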
print "$a[$i+1]+1"; # Output: 30+1
Again, the $a[$i+1] part undergoes the same interpolation and
evaluation as above, but the resulting string '30+1' is not processed any
further, because by T2 we only evaluate operators IF a
variable interpolation demands it. So T2 works here too.
print "$f->()+1"; # Output: CODE(0x9f9b8c)->()+1
Ok, so here's a strange one. We often think of references as being the
value of the thing they reference, rather than values in their own right -
with that sort of thinking you'll get bitten by this example. $f
is a variable all unto itself, the dereference operator does not
need to be evaluated to get a value for $f. Therefore it
is not evaluated, the scalar value of $f is substituted
into the string and the '->()+1' string is appended for output.
T2 stands (although possibly not the way we'd usually
prefer).
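Here is a hedged sketch of the same behaviour, plus the obvious way around it: call the sub outside the string and join the result on. The address printed for the code reference will differ on every run, so the comment only shows its general shape. The names $literal and $called are mine.

```perl
use strict;
use warnings;

my $f = sub { return 4 };

# Only $f is interpolated; '->()+1' is left behind as literal text.
my $literal = "$f->()+1";

# Calling the sub outside the string and concatenating gives the number.
my $called = "" . ($f->() + 1);

print "$literal\n";   # something like CODE(0x...)->()+1
print "$called\n";    # 5
```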
print "$r->[1]"; # Output: 10
But what about this one ? If what we said in Ex 7 was
correct, then this should behave the same way. $r has a
value in a string context just like $f did in Ex 7, and it doesn't demand a value from the dereference
operator. So why is the dereference done ? This must contradict
T2. Well, it certainly does. This would be the exception
to that rule. You'll find that $hashref->{name} does the
same thing. So let's write that exception into our theorem to make
T3.
T3: The variables in a string are interpolated, but no operators are evaluated UNLESS the interpolation of a variable demands a value from that operator OR the operator is a dereference operator and is followed by either a '[' or a '{'.
Does that sound like an ugly exception ? Well I'm afraid it's true. We'll see that later in "How tightly do '[' and '{' bind ?", but for now let's test our T3 with the current example.
The dereference operator is followed by a '[' so the operator is evaluated, and the result substituted. So T3 is happy.
And before we move on, let's just double check Ex 7 (which also used the dereference operator) with T3. The dereference operator in that example was not followed by '[' or '{' so no evaluation was done. T3 is still happy.
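To see T3's exception side by side, here is a minimal sketch with one reference of each kind. The variable names and the hash contents are made up for the illustration.

```perl
use strict;
use warnings;

my $aref = [1000, 10, 20];
my $href = { name => 'Alan' };
my $cref = sub { return 4 };

my $from_array = "$aref->[1]";      # '->' followed by '[': dereferenced
my $from_hash  = "$href->{name}";   # '->' followed by '{': dereferenced
my $not_called = "$cref->()";       # '->' followed by '(': left alone

print "$from_array\n";   # 10
print "$from_hash\n";    # Alan
print "$not_called\n";   # something like CODE(0x...)->()
```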
print "$a[$f->()+1]+1"; # Output: 50+1
Doesn't this look messy ? T3 doesn't think so.
$a[something] is a variable that needs a value to complete
interpolation. So $f->()+1 is evaluated. This gives us a
value of 5, which we use to index the @a array to get a value
that's then substituted into the string to give us '50+1', no further
variable interpolation exists, so no further evaluation is required.
T3 is happy.
print "@a[2..$f->()]"; # Output: 20 30 40
Again, a walk in the park for T3.
@a[something] is an array slice and needs to have
2..$f->() evaluated before interpolation can be complete.
The resulting value (2,3,4) is used to take the array slice.
T3 is yawningly happy.
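A short sketch of slices in strings, assuming the same @a and $f as above. The hash slice is my own addition (the original example only uses an array slice), but it follows the same rule: the list inside the brackets or braces is evaluated as code first.

```perl
use strict;
use warnings;

my @a = (1000, 10, 20, 30, 40, 50, 60);
my %h = (foo => 'bar', bar => 'baz');
my $f = sub { return 4 };

# The range 2..$f->() is evaluated as code to (2, 3, 4) before the slice is taken.
my $array_slice = "@a[2..$f->()]";

# A hash slice behaves the same way; the key list is evaluated as code.
my $hash_slice = "@h{'foo','bar'}";

print "$array_slice\n";   # 20 30 40
print "$hash_slice\n";    # bar baz
```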
So it would appear that we've got a good theorem to use for predicting Perl's interpolation. Let's have a look at some more examples, also posted by Alan Curry (he's such a troublemaker *8^).
Again, I strongly urge you to copy-paste these examples into your favourite editor and run them before proceeding. It adds a lot of value to this discussion if you have already run the examples, and therefore have an expectation about the output of each one.
my %h=(foo=>'bar', bar=>'baz');
my $s='foo';
print "$h{'foo'}";
print "$h{$s}";
print "$h{$h{$s}}";
print "$h{\"$h{$s}\"}";
print "$h{'foo'}"; # Output: bar
Hash elements (just like array elements) are treated as variables in their
own right. And the same rules apply as do those for array elements. So by
T3 (in fact by T2 and
T1) the above element is looked up in the
%h hash and substituted into the string. No surprises.
print "$h{$s}"; # Output: bar
Here Perl sees $h{something} and knows that to find the hash
element it must evaluate $s, it does that and gets
$h{foo} which it looks up in %h and substitutes into
the string. T3 is happy.
print "$h{$h{$s}}"; # Output: baz
Again, Perl sees $h{something} and therefore has to evaluate
$h{$s}. It does this and then has $h{bar} which
it substitutes into the string. T3 is not at all
challenged by this.
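The nested lookup is easier to follow when it is unrolled. This sketch assumes the same %h and $s; the intermediate variable $key is mine and only exists to show the order in which Perl does the work.

```perl
use strict;
use warnings;

my %h = (foo => 'bar', bar => 'baz');
my $s = 'foo';

my $direct = "$h{$s}";        # the key is the value of $s, i.e. 'foo'
my $nested = "$h{$h{$s}}";    # the inner lookup gives 'bar', which becomes the outer key

# The nested form is equivalent to doing the lookups one at a time.
my $key      = $h{$s};
my $stepwise = "$h{$key}";

print "$direct $nested $stepwise\n";   # bar baz baz
```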
print "$h{\"$h{$s}\"}"; # Output: baz
A little more complex, here Perl sees $h{something} and
needs to evaluate "$h{$s}" (notice that this code
actually contains the quotes) before it can substitute the variable. Of
course, in Perl a string is a valid piece of code and it will evaluate it.
Therefore Perl evaluates the string "$h{$s}",
which clearly requires more interpolation. It interpolates
$h{$s} and gets bar, which it then uses to evaluate
$h{bar} to get baz, which it then substitutes
into the original string.
We'll mention it again here, that after the original variable
$h{something} is seen in the original string, Perl then sets
about evaluating 'something' in a non-string context. Perl needs a value
for this 'something' before it can complete interpolation. In Ex 14 this 'something' happened to be another
double-quoted string (a double-quoted string happens to be a valid
statement in Perl), so it was evaluated and a second level of interpolation
occurred. But this second interpolation had nothing to do with the first -
it only occurred because Perl was looking for a value for 'something' and
so evaluated 'something', which happened to be a string.
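The two levels can be pulled apart like this. It's only a sketch, with $inner and $two_steps as names I've invented, but it shows that the one-line version and the spelled-out version arrive at the same place.

```perl
use strict;
use warnings;

my %h = (foo => 'bar', bar => 'baz');
my $s = 'foo';

# The subscript is itself a double-quoted string, so it is interpolated first.
my $one_step = "$h{\"$h{$s}\"}";

# The same thing spelled out: evaluate the inner string, then use it as the key.
my $inner     = "$h{$s}";       # 'bar'
my $two_steps = "$h{$inner}";   # 'baz'

print "$one_step $two_steps\n";   # baz baz
```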
Just in case you don't understand what I mean by the second level of quotation and interpolation, let's look at another example which uses single quotes instead of double quotes.
print "$h{'$h{$s}'}"; # Output: Use of uninitialized value at - line 28.
If you run this code then you'll get a warning of Use of uninitialized value at - line %d. This is because Perl did the following.
First it sees $h{something} and so evaluates the contents of
'something', which are '$h{$s}'. Now you'll remember me
saying that this evaluation is done in a non-string context. Well, here it
will be very obvious. The expression '$h{$s}' (note the
single quotes, they're part of the expression) is not interpolated, so it
literally has the value of the string $h{$s} which Perl then
attempts to look up in the %h hash. As the message says, it
can't find it - so we get a warning about using an uninitialised value.
This is in stark contrast to the previous example where there was a double-quoted string that was interpolated further.
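One way to convince yourself of this is to give the hash a key that really is the six characters $h{$s}. That extra key is purely my invention for this sketch; with it in place the single-quoted version finds something instead of warning.

```perl
use strict;
use warnings;

# The third key is the literal six-character string $h{$s}.
my %h = (foo => 'bar', bar => 'baz', '$h{$s}' => 'literal key');
my $s = 'foo';

my $double = "$h{\"$h{$s}\"}";   # inner string interpolated: the key is 'bar'
my $single = "$h{'$h{$s}'}";     # inner string not interpolated: the key is '$h{$s}'

print "$double\n";   # baz
print "$single\n";   # literal key
```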
You'll remember that when we wrote down T3 that we made the rather ugly exception concerning a dereference operator followed by either a '[' or a '{'. Here we're going to revisit that, and have a look at a few examples that will show you that this is in fact how Perl behaves.
The first examples that we're going to look at will not be using the dereference operator. We want to first see how tightly the '[' and '{' bind to a variable name. This is an important point to understand for anyone who thinks of '[' or '{' as an operator - they really are not treated as such - they're absent from the perlop manual for a reason.
Some of the examples below intentionally cause syntax errors, so we haven't listed them as a working program, just run them one at a time. With each example we're showing both the '[' version and the '{' version. All the examples assume the following simple declaration:
#!/usr/bin/perl -wl
# leave out the 'use strict' this time
my $a = 4; # a simple scalar
print "$a+5"; # Output: 4+5
Nothing amazing about the results of this, as we've seen with T2 and T3 the + operator is
not evaluated unless it's needed for a variable interpolation - here it
clearly isn't so it's just part of the string.
print "$a[0]"; # Output: Use of uninitialized value at - line 2.
print "$a{name}"; # Output: Use of uninitialized value at - line 2.
What's going on here ? You'll get a warning (from the '-w') of Use
of uninitialized value at - line %d. indicating that Perl's not
using your $a but instead it's trying to evaluate the array
element $a[0] (or the hash element $a{name}).
What does this mean ? It means that Perl binds very
tightly to the '[' and '{' tokens. It doesn't do any checking of the
symbol table, doesn't care that it skipped past the valid variable
$a, it just sees the '[' or '{' and tries to evaluate the array or
hash element.
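The tight binding is easiest to see when a scalar, an array and a hash all share a name. In this sketch I've used $n, @n and %n (my names, chosen to stay clear of the special $a) and declared all three, so it runs cleanly under 'use strict' and there's no warning to squint at.

```perl
use strict;
use warnings;

my $n = 4;                          # a scalar
my @n = ('array element');          # an unrelated array with the same name
my %n = (name => 'hash element');   # an unrelated hash with the same name

my $scalar = "$n";         # the scalar $n
my $array  = "$n[0]";      # element 0 of @n, not $n followed by '[0]'
my $hash   = "$n{name}";   # element 'name' of %n, not $n followed by '{name}'

print "$scalar\n";   # 4
print "$array\n";    # array element
print "$hash\n";     # hash element
```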
print "$a["; # Output: Missing right bracket at - line 2, within string
print "$a{"; # Output: Missing right bracket at - line 2, within string
You'd expect Perl at least to know what you're talking about here ! You haven't even written a complete element subscript. But Perl is blind to this: all it sees is the '[' or '{', and it then goes looking for the element. Believe it or not, you have to go to the lengths shown a little further on (after one more failing attempt) to get the output we're after here.
print "$a->["; # Output: Missing right bracket at - line 2, within string
print "$a->{"; # Output: Missing right bracket at - line 2, within string
Same deal with dereferenced elements. Basically if Perl sees that damn '[' or '{' directly after a variable name (or after the '->' that follows one) then it jumps on it and begins evaluation, even when it has a perfectly good variable to work with.
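If you'd rather watch one of these fail without it killing your script, a string eval will do. This is just a sketch: $snippet and $n are my own names, and I've deliberately not shown the error text because its exact wording varies between Perl versions.

```perl
use strict;
use warnings;

# Single-quoted, so the snippet is compiled by eval and not by this script.
my $snippet = 'my $n = 4; my $text = "$n["; 1';

# eval returns undef and sets $@ if the snippet does not compile.
my $ok = eval $snippet;

if ($ok) {
    print "compiled\n";
} else {
    print "failed to compile: $@";
}
```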
print "$a" . "["; # Output: 4[
print "$a" . "{"; # Output: 4{
Here Perl performs the interpolation before it evaluates
the . (string concatenation) operator, so by the time Perl
sees the '[' or the '{' it can't do anything with it, and finally we get
our output.
print "${a}["; # 5.005_03 Output: 4[
# 5.6.0 Output: Name "main::a" used only once: possible typo at - line 3.
#               Use of uninitialized value in concatenation (.) at - line 3.
#               [
print "${a}{"; # 5.005_03 Output: 4{
# 5.6.0 Output: Name "main::a" used only once: possible typo at - line 3.
#               Use of uninitialized value in concatenation (.) at - line 3.
#               {
In 5.005_03 you can see that we had another way to tell Perl that
$a is its own variable and should not be confused with the
'[' or '{'. But in 5.6.0 the braces seem to force Perl to evaluate
$a as a package variable, and hence we get two warnings and an
empty value in place of the 4.
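For reference, here is a sketch of three ways to get a literal '[' straight after a scalar. Only the first two appear in the examples above; the backslash form is my own addition. The braces and backslash lines reflect my understanding of how Perl releases other than 5.6.0 treat them, so do run it on your own Perl before relying on it. I've used $n rather than $a to stay clear of the special sort variable.

```perl
use strict;
use warnings;

my $n = 4;

my $concat  = "$n" . "[";   # interpolation is finished before '.' joins the parts
my $braces  = "${n}[";      # braces mark where the variable name ends
my $escaped = "$n\[";       # a backslash stops '[' being read as a subscript

print "$concat $braces $escaped\n";
```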
That's all really, hope you understand a bit more about variable interpolation in strings.
Copyright © 1999 Jason King. All rights reserved.