Skip to content

Overloading: always prefer non-varargs to varargs #9718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

smarter
Copy link
Member

@smarter smarter commented Sep 3, 2020

We recently merged #9601 which unified our handling of Object coming
from Java methods, an unintended consequence of that change is that some
existing Java APIs can no longer be called without running into
ambiguity errors, for example log4j defines two overloads for
Logger.error:

(x: String, y: Object): Unit
(x: String, y: Object*): Unit

previously we translated Object to Any but left Object* alone, now
they're both treated the same way (translated to a special alias of
Object) and so neither method ends up being more specific than the
other, so error("foo: {}, 1) is now ambiguous.

Clearly the problem lies with how we handle varargs in overloading
resolution, but this has been a source of issues for years with no clear
resolution (hah!):

This PR cuts the Gordian knot by simply declaring that non-varargs
methods are always more specific than varargs. This has several
advantages:

The downside is that it doesn't match what Scala 2 does, but our current
behavior isn't a perfect match either (also it seems that Scala 2
handles Java varargs and Scala varargs differently in overloading
resolution which is another source of complexity best avoided, see
tests/run/overload_repeated).

Fixes #9688, supercedes #6230.

We recently merged scala#9601 which unified our handling of `Object` coming
from Java methods, an unintended consequence of that change is that some
existing Java APIs can no longer be called without running into
ambiguity errors, for example log4j defines two overloads for
`Logger.error`:

    (x: String, y: Object): Unit
    (x: String, y: Object*): Unit

previously we translated `Object` to `Any` but left `Object*` alone, now
they're both treated the same way (translated to a special alias of
`Object`) and so neither method ends up being more specific than the
other, so `error("foo: {}, 1)` is now ambiguous.

Clearly the problem lies with how we handle varargs in overloading
resolution, but this has been a source of issues for years with no clear
resolution:
- scala/bug#8342
- scala/bug#4728
- scala/bug#8344
- scala#6230

This PR cuts the Gordian knot by simply declaring that non-varargs
methods are always more specific than varargs. This has several
advantages:
- It's an easy rule to remember
- It matches what Java does (see
  https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2)
- It avoids unnecessary wrapping of arguments

The downside is that it doesn't match what Scala 2 does, but our current
behavior isn't a perfect match either (also it seems that Scala 2
handles Java varargs and Scala varargs differently in overloading
resolution which is another source of complexity best avoided, see
`tests/run/overload_repeated`).

Fixes scala#9688, supercedes scala#6230.
@smarter smarter mentioned this pull request Sep 3, 2020
@odersky
Copy link
Contributor

odersky commented Sep 3, 2020

This would have to be added to the spec. At first glance, I don't like it. Overloading resolution rules are already quite complex; every special rule we add makes the situation worse. The previous overloading rules worked just with applicability. As a consequence: f(x: T) would be more specific that f(x: T*) since a T* formal can accept a T argument but not vice-versa. That's all as intended and does not require a special rule. The only problem is with f(x: Object) vs f(x: Object*). That would be ambiguous since we also have that a Object* (which is essentially Seq[Object]) can be passed to a Object formal.

I think a more conservative and simpler fix would be to change this last rule. I.e. declare that for the purposes of applicability checking, a varargs parameter cannot be passed to a single parameter, even if that single parameter has type Object or Any. That should work, but it still requires careful study.

But in any case, never change the overloading algorithm without changing the spec. We are very close to doom here already; any further discrepancy could be irrecoverable.

@odersky
Copy link
Contributor

odersky commented Sep 3, 2020

Another reason not to go down this path is that it would prefer the first alternative here:

def f(x: Any) = 1
def f(x: String*) = 2
f("hello")

I think that's wrong. By contrast, my proposed rule would give an ambiguity in this case which seems safer.

@smarter
Copy link
Member Author

smarter commented Sep 3, 2020

I think that's wrong

I don't have a strong opinion here, but I'll note that Java always prefer the non-varargs, so if we want to do something different without potentially breaking Java APIs we need to special-case Java varargs somehow (Scala 2 seems to do that).

But in any case, never change the overloading algorithm without changing the spec.

Unfortunately we don't have an editable version of the spec for Scala 3 changes in this repo.

@odersky
Copy link
Contributor

odersky commented Sep 3, 2020

I don't have a strong opinion here, but I'll note that Java always prefer the non-varargs, so if we want to do something different without potentially breaking Java APIs we need to special-case Java varargs somehow (Scala 2 seems to do that).

I believe Java might already have jumped the shark here. Last time I looked, their overloading rules were a complete mess. But that should not be an excuse to do the same for Scala.

Unfortunately we don't have an editable version of the spec for Scala 3 changes in this repo.

Then we need to create a "changes to overloading resolution page in references that explains the old spec and how it changes. Similar to the "changes in implicit resolution" page. Maybe it can even be combined, since both are related.

@odersky
Copy link
Contributor

odersky commented Sep 4, 2020

Here's a systematic chart of the behaviour under the different scenarios. Each column gives the parameter of the alternative that is picked.

                 Status quo     this PR    Alternative
T/T*              T             T            T
Object/T*         T*            Object       ambiguous
Object/Object*    ambiguous     Object       Object  
T/Object*         T             T            T

This PR removes all ambiguities, but it changes the run-time semantics for line 2, and I believe it picks the non-intuitive alternative in that case.

I believe it's safer to make line 2 ambiguous. Note that this would not pose a fundamental problem for application code, since
one can always upcast the argument to Object to disambiguate. Or, if one wants the T* alternative, pass it as
List(x): _*.

@abgruszecki
Copy link
Contributor

abgruszecki commented Sep 4, 2020

If I have all of my facts correct, I think this is the best solution to the problem.

Here's a summary. In #6230 I suggested that the fix is to make repeated params not a subtype of any other type (#6230 (comment)). Then it came up that this is equivalent to always preferring non-varargs to varargs (#6230 (comment)). This makes sense to me, because I see having repeated params as a property of the parameter list, not of the type of the argument.

The primary problem with the spec is that when defining specificity of overload alternatives (https://www.scala-lang.org/files/archive/spec/2.12/06-expressions.html#overloading-resolution), it talks about one alternative being applicable to the types of parameters of another alternative, but it never mentions how to understand the type of a repeated parameter. The fix I'd propose is to:

  1. First, try to select the most specific alternative that doesn't have a repeated parameter
  2. Second, try to select the most specific alternative that has a repeated parameter,
    by treating the type of a repeated parameter T* as T

Again, this is in line with understanding repeated params as a property of the parameter list. This also avoids problem with arriving at constraints such as α >: T* (see second part of #6230 (comment)).

Finally, I think that we should always prefer an alternative without repeated params, because it's always possible to opt into the other alternative by marking the last argument as repeated, while the reverse is not true. Consider:

trait C:
  def foo[T](ts: T*)
  def foo(any: Any)

If we treat applications of C#foo as ambiguous, then I don't see any way to explicitly select the second alternative.

This is also in line with how Scala 2 and Java treat overloading resolution (Scala 2 selects the second alternative in the above example).

@odersky
Copy link
Contributor

odersky commented Sep 4, 2020

The fix I'd propose is to:

  1. First, try to select the most specific alternative that doesn't have a repeated parameter
  2. Second, try to select the most specific alternative that has a repeated parameter,
    by treating the type of a repeated parameter T* as T

The problem is that backtracking like this leads to exponential explosions of complexity.

@abgruszecki
Copy link
Contributor

Right, I worded myself poorly. What I meant is that when comparing two overloading alternatives for specificity:

  1. If both don't have repeated params, we treat them as we do now
  2. If only one has repeated params, that one is more specific
  3. If both have repeated params, we compare them as we do now with the caveat that we treat the type of T* as T

Alternatively (and, I think, equivalently), we could specify that when converting function parameter to arguments to check for specificity, we treat a repeated parameter p_n: T_n* as an argument of the form p_n : _*, where the type of p_n is Seq[T_n]. That might be an ever smaller adjustment, though possibly less clear on the intent.

@odersky odersky mentioned this pull request Sep 5, 2020
@odersky
Copy link
Contributor

odersky commented Sep 5, 2020

I have opened another PR which I believe fixes the problems but requires fewer changes. It implements exactly these rules:

If both don't have repeated params, we treat them as we do now
If only one has repeated params, that one is more specific
If both have repeated params, we compare them as we do now with the caveat that we treat the type of T* as T

smarter pushed a commit to dotty-staging/dotty that referenced this pull request Sep 7, 2020
Refine/clarify the rule when a method is applicable to a vararg parameter list.

Test cases adapted from scala#9718 which used a different approach.
@odersky odersky closed this Sep 7, 2020
@smarter
Copy link
Member Author

smarter commented Sep 7, 2020

Superceded by #9732.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ambiguous overload between varargs and non-varargs
3 participants