Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Improve overload resolution #20054

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

EugeneFlesselle
Copy link
Contributor

@EugeneFlesselle EugeneFlesselle commented Mar 31, 2024

resolveOverloaded1 starts with narrowMostSpecific(candidates) which compares the alternatives based on the 1st argument list. If more than one result if found, we can continue with the 2nd argument list. But we do so with the original candidates, rather than with the result of narrowMostSpecific. This can lead to choosing an alternative which is not the most specific, by now disregarding the 1st argument list.

Fixes #20053, the 1st pass correctly eliminated the 3rd def ^^^, but could not resolve between the 1st two (having the same 1st argument list). The 2nd pass then disregarded this and restarted the comparison based on the 2nd argument list alone, which yielded the 3rd option.

The change is just to use the initial result of narrowMostSpecific in the recursive resolution based on subsequent argument lists.

EDIT: as of f40c609 the changes in overload resolution apply starting from 3.7 and are reported in warnings in 3.6.

@EugeneFlesselle EugeneFlesselle changed the title Improve overload resolution [WIP] Improve overload resolution Mar 31, 2024
odersky added a commit to dotty-staging/dotty that referenced this pull request Apr 3, 2024
We sometimes have two alternatives a.m and b.m with the same symbol
but different prefixes. Previously these would always be ambiguous.
We now try to disambiguate this so that the alternative with the
more specific prefix wins. To determine this, we widen prefixes
also going from module classes to their parents and then compare
the resulting types.

This might fix a problem in ScalaTest that popped up after scala#20054.
odersky added a commit that referenced this pull request Apr 8, 2024
We sometimes have two alternatives a.m and b.m with the same symbol and
type but different prefixes. Previously these would always be ambiguous.
We now try to disambiguate this so that the alternative with the more
specific prefix wins. To determine this, we widen prefixes also going
from module classes to their parents and then compare the resulting
types.

This might fix a problem in ScalaTest that popped up after #20054.
@Gedochao Gedochao requested a review from sjrd June 10, 2024 14:19
@sjrd
Copy link
Member

sjrd commented Jun 28, 2024

Phew, I got it minimized to

class TestBody1
class TestBody2

class StartWithWord
class EndWithWord

class Matchers:
  extension (leftSideString: String)
    def should(body: TestBody1): Unit = ()
    def should(body: TestBody2): Unit = ()

  extension [T](leftSideValue: T)
    def should(word: StartWithWord)(using T <:< String): Unit = ()
    def should(word: EndWithWord)(using T <:< String): Unit = ()

  def endWith(rightSideString: String): EndWithWord = new EndWithWord

class Test extends Matchers:
  def test(): Unit =
    "hello world" should endWith ("world")

Before the change in this PR, the above code compiles. After, it fails to compile with

-- [E008] Not Found Error: tests/pos/i20053c.scala:20:18 -----------------------
20 |    "hello world" should endWith ("world")
   |    ^^^^^^^^^^^^^^^^^^^^
   |value should is not a member of String.
   |An extension method was tried, but could not be fully constructed:
   |
   |    this.should()
   |
   |    failed with:
   |
   |        value should: <overloaded Test.this.should> does not take parameters
1 error found

It's not exactly the same error that we find in Scalactic, but I'm pretty sure it has the same root cause.

As we can see, there are 4 initial candidates for should. Based only on the first parameter list, the ones with leftSideString: String are more specific than the ones with leftSideValue: T. Since there are 2 of them, however, resolution proceeds to the second parameter list.

  • Before this PR, all four candidates are carried to the second round, where overload resolution finds that the only applicable one (never mind specificity) is the one taking an EndWithWord. So resolution succeeds.
  • After this PR, only the two candidates with leftSideString: String are carried. But then overload resolution finds no applicable alternative that accepts an EndWithWord, so compilation fails.

This example shows that the changes proposed here, as is, are not valid. They definitely break existing code.

It might be possible to refine the logic further, in a non-local way: keep resolving based on further parameter lists with all candidates, but after all param lists have been exhausted, look back at the combined specificity or something. Not sure we want to go there.

@EugeneFlesselle
Copy link
Contributor Author

EugeneFlesselle commented Jun 28, 2024

Great minimization ! On a very minor note, the (using T <:< String) can also be removed.

This is really just restating the same thing, but also an interesting example:

object A:
  def f(x: Any)(y: Double) = ??? // alt1
  def f(x: Int)(y: String) = ??? // alt2
  def f(x: Int)(y: Boolean) = ??? // alt3
  f(3)(4d) // ok, resolves to alt1

object B: // same as A but without alt3
  def f(x: Any)(y: Double) = ??? // alt1
  def f(x: Int)(y: String) = ??? // alt2
  f(3)(4d) // error, even though alt1 is the only applicable alternative

@soronpo
Copy link
Contributor

soronpo commented Jun 28, 2024

This is really just restating the same thing, but also an interesting example:

Actually, this could be seen as expected behavior, and signatures for overloading should be matched block-by-block.
Mainly because later block signatures can be type dependent on earlier blocks.
def f[T <: Foo](x: T)(y: x.Dep) = ???

`resolveOverloaded1` starts with `narrowMostSpecific(candidates)`
which compares the alternatives based on the 1st argument list.
If more than one result if found, we can continue with the 2nd argument list.
But we do so with the original candidates, rather than with the result of `narrowMostSpecific`.
This can lead to choosing an alternative which is not the most specific,
by now disregarding the 1st argument list.

In i120053, the 1st pass correctly eliminated the 3rd `def ^^^`,
but could not resolve between the 1st two (having the same argument list).
The 2nd pass then disregarded this and restarted the comparison
based on the 2nd argument list alone, which incorrectly yielded the 3rd option.

The change is simply using the initial result of `narrowMostSpecific` in the recursive resolution based on subsequent argument lists.
I'm not sure however if the same changes should apply to the rest of the cases attempting further narrowing ?
…in 3.6

This is similar to the warnings for changes in given preference.
In this case, however, the changes only affect a part of disambiguation used
relatively late in the process, as a "last resort" disambiguation mechanism.
We can therefore accept running resolution independently with both schemes
in these cases to detect and report changes.

Having a source position for the warning messages requires passing the tree
source position to resolveOverloaded from the Typer.
It could previously be avoided this since any potential error in overloading
could be determined from its result. Clearly, this cannot be done for the new
warnings, although I am open to an alternative design.
Note that in tests/neg/multiparamlist-overload-3.7 Test2 we now
get an error in both Parts as intended. But they are, however, different ones:
Part1 is a TypeMismatch Error, whereas Part2 is a NoMatchingOverload Error.

This is due to the fact that `resolveOverloaded` will first
find the candidate alternatives by considering only the 1st parameter list
and commit early if there is a single one, e.g. Test2.Part1.
If not, we recursively continue with the found alternatives and
recompute the candidate alternatives based on the 2nd parameter list,
which may rule out all of them, and hence lead to a different message.
@EugeneFlesselle
Copy link
Contributor Author

This would be another PR worth testing the impact of on the community build. @WojciechMazur could you run it based on the changes here ?

I already expect numerous errors caused by the scalatest should overload set, which would hopefully be fixed by a change in scalatest. It'd be great to have an idea of whether this is the only major source of impacted code.

@WojciechMazur
Copy link
Contributor

WojciechMazur commented Aug 21, 2024

After testing 1590 projects we have 5 new regressions when compared with 3.6.0-RC1-bin-20240819-b64afad-NIGHTLY for which 29 projects were already failing.
Tested using commit f40c609

Side note: there is a large number of projects that are now using the -source:3.5-migration or 3.4-migration version (~100). In most of these, it should no longer be needed because we can no perform automatic rewrites to replace usages of syntax that fails to compiler since 3.6. I'll retry the build after resolving this issue.

Project Version Build URL Notes
fenixedu/fenixedu-scala-sdk 0.5.0 Open CB logs
fs2-blobstore/fs2-blobstore 0.9.14 Open CB logs
ist-dsi/scala-keystone-client 0.12.2 Open CB logs
ist-dsi/scala-nova-client 0.10.1 Open CB logs
laserdisc-io/mysql-binlog-stream 3.0.0 Open CB logs

I can see there are also migration warnings for the overloads. I'm collecting statistics about these warnings right now.

#Edit: There are no warnings because projects were compiled using -source:3.6-migration. The statistics need to way until a next run with forced -source:3.6 or -source:3.7-migration setting

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overloading or implicit bug in 3.4.1
4 participants