Wednesday, June 3, 2009

Parser Surprise

David Vydra found it surprising that he couldn't do the following:

scala> -1.toString
<console>:1: error: ';' expected but '.' found.
-1.toString
^


I was surprised too, by a similar example on page 208 of Programming in Scala: A Comprehensive Step-by-step Guide.

Given that example, I would not have expected that message in particular. I would have expected an error saying that "unary_-" is not a member of java.lang.String, like this:

scala> -(1 toString)
<console>:5: error: value unary_- is not a member of java.lang.String
-(1 toString)
^


Or, in fact, like this:

scala> +1.toString
<console>:5: error: value unary_+ is not a member of java.lang.String
+1.toString
^


Some people have raised the question of operator precedence, claiming "-" ought to have higher precedence than ".". That might seem reasonable in this example, where "1" is obviously an integer literal. But what about "-object.size"? Would you expect it to negate the object's size, or to negate the object and then return the size?

So let's take a look at Scala's operator precedence rules. The rules state that, with a sole exception, precedence is based on an operator's first character, from the lowest to the highest, as follows:

(all letters)
|
^
&
= !
< >
:
+ -
* / %
(all other special characters)

The exception is that assignment operators have lower precedence than any other operator. Because of this exception, +=, being an assignment, has lower precedence than <=, the less-than-or-equal-to operator, even though operators starting with "+" have precedence over those starting with "<".
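To make the first-character rule and the assignment exception concrete, here is a minimal sketch. The class T, its operators, and the helper names are hypothetical, invented only for illustration: each operator returns a string showing how the parser grouped its operands.

```scala
// Hypothetical class for illustration: every operator records the grouping it received.
final case class T(s: String) {
  private def wrap(op: String, o: T): T = T("(" + s + " " + op + " " + o.s + ")")
  def |(o: T): T  = wrap("|", o)
  def <=(o: T): T = wrap("<=", o)
  def +(o: T): T  = wrap("+", o)
  def +*(o: T): T = wrap("+*", o) // first character is '+', so it ranks with +
  def *(o: T): T  = wrap("*", o)
}

val a = T("a")
val b = T("b")
val c = T("c")

val plusTimes   = (a + b * c).s      // "(a + (b * c))"
val timesFirst  = (a * b +* c).s     // "((a * b) +* c)"
val firstCharOf = (a +* b * c).s     // "(a +* (b * c))"
val lowToHigh   = (a | b <= c + a).s // "(a | (b <= (c + a)))"

// The assignment exception: += ranks below <<, although '+' outranks '<'.
def assignmentDemo(): Int = {
  var x = 1
  x += 1 << 2 // parsed as x += (1 << 2)
  x
}
```

Without the exception, the last statement would group as (x += 1) << 2, which does not even type-check, since the assignment yields Unit.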

These, though, are the precedence rules for infix operators, and "-" in "-1" is not infix, but prefix. So, does infix precedence apply to prefix operators? No, as the example below shows:

scala> class X {
| def unary_- = {println("Unary -"); this}
| def *(b : X) = {println("operator *"); this}
| def -(b : X) = {println("operator -"); this}
| override def toString = "class X"
| }
defined class X

scala> new X
res16: X = class X

scala> new X
res17: X = class X

scala> -res16 * res17
Unary -
operator *
res18: X = class X

scala> res16 - res17 * res16
operator *
operator -
res19: X = class X


So we can see clearly that unary_- has precedence over "*", which, itself, has precedence over "-".

So, back to ".", what is happening? Well, for one thing, "." is not an operator. Dot is considered an operator in some other languages. In Scala, though, it is a syntactic construct. If it were an operator, you would be able to define it in a class. As it is, it's a reserved symbol. So any operator will only be considered after resolving ".".
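As a small sketch of what "resolving the dot first" means in practice, consider the following. The class Box and the value names are hypothetical, made up for this illustration. It answers the earlier "-object.size" question: the selection happens first, and the prefix operator is applied to its result.

```scala
// Hypothetical class for illustration; not taken from any library.
final case class Box(label: String) {
  // Prefix minus: wraps the label so the order of application stays visible.
  def unary_- : Box = Box("neg(" + label + ")")
  // An ordinary member reached through "." selection.
  def inner: Box = Box(label + ".inner")
}

val box = Box("b")

// Selection is resolved first, then the prefix operator is applied.
val selectThenNegate = (-box.inner).label   // "neg(b.inner)"
// Parentheses force the negation to happen first.
val negateThenSelect = ((-box).inner).label // "neg(b).inner"

// The "-object.size" case: the size is negated, not the list.
val negatedSize = -List(1, 2, 3).size       // -3
// Parenthesizing the literal sidesteps the question entirely.
val literalText = (-1).toString             // "-1"
```

Writing (-1).toString is unambiguous regardless of how a given compiler version treats negative literals.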

Still, again, why isn't the error as follows?

scala> -(1 toString)
<console>:5: error: value unary_- is not a member of java.lang.String
-(1 toString)
^


Johannes Rudolph has since investigated this further. Here is his analysis:


Originally(*), and despite of the language specification (**),
negative number literals could not appear in a select expression
(-1.max(5)). Since [19950] which 'fixed' #2378 (***), the balance
between integer and floating point literals drifted: for with what the
parser was concerned, negative floating point literals received no
special treating any more, so -5f.max(2) was seen as
(5f.max(2)).unary_-, simple negative floating literals as 5f.unary_-
(*!*). For negative integer literals the situation stayed the way it
was the way before: -1.max(5) was simply not allowed by the parser.
(*) That's [3930], the commit with the unimpressive empty log message
where nsc sprang to life.
(**) The SLS would allow both 5.unary_- and the literal -5 and does at
least not specify how to resolve this ambiguity.
(***) The real cause of #2378 can be found in the backyard of the
backend: FJBG generates a DCONST_0 if a double literal == 0.0, what
-0.0 is.
(*!*) This behaviour caused #3486, where negative constants in
annotations wouldn't be possible any more.
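The footnote about DCONST_0 turns on the fact that -0.0 compares equal to 0.0 while being a different value. The sketch below illustrates only that property of doubles; it is not a reconstruction of the FJBG code, and the value names are made up for the example. The negative zero is obtained through parseDouble so that the result does not depend on how the compiler handles a negative literal.

```scala
val posZero: Double = 0.0
// Parsed at run time, so no negative literal is involved.
val negZero: Double = java.lang.Double.parseDouble("-0.0")

// Numeric comparison treats the two zeros as equal...
val equalByValue = posZero == negZero

// ...but their bit patterns differ (only the sign bit is set in -0.0)...
val bitsDiffer =
  java.lang.Double.doubleToRawLongBits(posZero) !=
    java.lang.Double.doubleToRawLongBits(negZero)

// ...and the difference is observable through division.
val reciprocal = 1.0 / negZero // negative infinity
```

A code generator that picks a "push positive zero" instruction whenever a literal == 0.0 would therefore lose the sign of -0.0, which is consistent with the cause described in the footnote.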

2 comments:

  1. "The exception is that an assignment has lower precedence than any other operator. Because of this excection, += -- an assignment -- has precedence over <=, the less-than-or-equal-to operator, even though operators starting with + have precedence over those starting with "<"."

    There is something wrong with this sentence? Did you mean that += has a lower precedence than <=?

  2. Indeed it was wrong. Thanks, I have corrected it.
