Skip to content

Fix #4867 Union type inference is too eager to widen unions #7829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 51 commits into from

Conversation

landerlo
Copy link

@landerlo landerlo commented Dec 20, 2019

This inference enhancement for union inference follows @smarter 's suggested approach of preserving the union if there is an explicit union ascription. Otherwise the union is widened with OrType.join.

Uses the very naive but effective approach of lexically checking the untyped tree for Or operators or TypedSplices when the Typer is widening the unions.

The fix only touches that widening point to introduce the lexical check. If the Or operator is found on the type tree of the term, or the DefDef type tree is a TypedSplice, this is seen as a hint of an explicit union ascription, both as a direct union ascription or a union ascription in a Function signature. This heuristic works well for the case in @Baccata 's code and other union type inference issues in my own exploration.

IMO the non-rigorous approach is justified as this is a very targeted check that simply defers the widening of the union type. In the worst case, the union would be joined at a later time. I couldn't foresee any negative impacts by not joining the OrType as this check is only performed when the typer has detected the OrType and calls widenUnion. In case some ascriptions are missed by this check this would only mean the join is performed eagerly keeping the current behaviour.

We get a more expressive union type inference with no cost, as the untyped tree check won't impact performance or require code changes that could interact with the typer in unexpected ways.
The current checks on whether there is an Or operator on the tpt of ValDef or the tpt of a DefDef is a TypedSplice covers the explicit union ascriptions of terms and functions signatures.

Copy link
Member

@dottybot dottybot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Have an awesome day! ☀️

@sjrd
Copy link
Member

sjrd commented Dec 21, 2019

Thanks for your PR. I don't think this is the right approach, though. The fact that the following still doesn't work is a clear sign to me:

val x: B | C = ...
val z = x // We lose the Union ascription as it's not explicit
val error: B | C = z // error as z has lost union ascription

We should actually tag every OrType with a flag that tells whether it was written down or created by the compiler. Then, only widen the ones that were initially created by the compiler. With that strategy, when inferring the type of z, it would receive the type of x which is an OrType with the flag "written by user" on, and hence it would not be widened. It would even keep that flag (since it receives the very same OrType) which means that this can be changed multiple times.

@landerlo
Copy link
Author

I agree that that would be a more desirable end state, but to flag the provenance would require more changes and we would need to deal with the situation of a union growing to too many branches in the disjunction, something that @odersky always tried to avoid. Forcing the union to always be explicit through the ascription deals with that issue in a simple way. If we allow an originally explicit union to flow and grow even when not explicitly written by the user we would need to find other strategies, e.g. having a cut-off size where the union is widened if the disjunction goes over N branches. Again, this seems to require some more discussion.

I'm happy to keep exploring that route and I would love if we had tha discussion as I am a great believer in the potential of union types.

But for the moment I think that merging is a reasonable incremental step, as it gives an instant incremental benefit and the reverting cost is zero as there are no code changes other than that targeted pattern match and widening deferring if.

This simple change unlocks interesting usages of union types, we've seen people encountering this issue. And those extended usages can inform further evolution into a more principled approach like tracking the provenance and making that ascribed union origin flow into other unions that haven't been explicitly written, in case the dotty team wants to follow that route.

@sjrd
Copy link
Member

sjrd commented Dec 21, 2019

I agree that that would be a more desirable end state, but to flag the provenance would require more changes and we would need to deal with the situation of a union growing to too many branches in the disjunction, something that @odersky always tried to avoid. Forcing the union to always be explicit through the ascription deals with that issue in a simple way. If we allow an originally explicit union to flow and grow even when not explicitly written by the user we would need to find other strategies, e.g. having a cut-off size where the union is widened if the disjunction goes over N branches. Again, this seems to require some more discussion.

An originally written union could indeed flow without further explicit types. However it would never be able to grow without explicitly writing the bigger union. Therefore those concerns are also addressed at the core with what I suggest.

Note also that what I suggest has been discussed with @odersky and @smarter, and both agreed that it was a likely solution to all our issues with inference of union types.

But for the moment I think that merging is a reasonable incremental step, as it gives an instant incremental benefit and the reverting cost is zero as there are no code changes other than that targeted pattern match and widening deferring if.

I disagree, but it's not my call to make so you don't need to convince me. Someone from the core dotty team will have to make that call.

@landerlo
Copy link
Author

landerlo commented Dec 21, 2019

Thank you for the feedback.

An originally written union could indeed flow without further explicit types. However it would never be able to grow without explicitly writing the bigger union

I still don't see how the flowing of the original union won't grow into bigger unions not explicitly written if we don't widen based on the flag saying there was an explicitly written union at the origin of the flow.

If the term with a union type goes through expressions that grow the union surely the union might grow into unions not explicitly written. e.g.:

val x: A | B | C = ???
val y: D | E | F = ???

val z = if (true) x else y

Of course this can be controlled if we introduce additional widening cases in addition to checking the flag, but I am unaware of that logic having already been discussed in that level of detail. I must have overlooked it.

@sjrd
Copy link
Member

sjrd commented Dec 21, 2019

Let's note | a union type that was written, and |! a union type that was created by the compiler.

The result of the if expression that you wrote would then be (A | B | C) |! (D | E | F). When deciding the inferred type of z, the compiler would notice that the top-level union was created by the compiler and would therefore widen it. The union would not grow.

@landerlo
Copy link
Author

I see your point. Widening only when the union grows.
This would results in the new semantics of:

val x: A | B | C = ???
val y: D         = ???

val z = x                    // Infers A | B | C
val zz = if (true) x else y  // We lose inference and got Any

The loss of inference of zz still not completely satisfactory but I agree this would be an improvement over my current fix.
I was wary of modifications to the core classes but I'll check what would be the implications of adding the flag to OrType.

@landerlo landerlo force-pushed the fix-#4867 branch 3 times, most recently from 0f1b479 to c7cf9e9 Compare December 28, 2019 22:15
@LPTK
Copy link
Contributor

LPTK commented Dec 30, 2019

If we go down the route suggested by @sjrd, wouldn't it make sense to generalize it to all other widening behaviors in Scala?

For instance, making the following work:

scala> val x: 1 = 1
val x: 1 = 1

scala> val y = x
val y: Int = 1

scala> val z: 1 = y
1 |val z: 1 = y
  |           ^
  |           Found:    (y : Int)
  |           Required: (1 : Int)

If we do not do that in general, then I feel like it's weird to do it only for unions, and I would actually find the behavior originally proposed in this PR less confusing. Indeed, it would mean there is only one automatic widening behavior, instead of two.

@landerlo
Copy link
Author

landerlo commented Jan 3, 2020

I agree it'd be desirable to make it as general as possible, and that inference flowed through assignments. I wanted to keep the scope as minimal as the simple change in the PR unblocks some interesting union usages I've been experimenting with. Keeping the provenance as @sjrd suggests would require more refactoring a and change to OrType to add the explicit ascription flag. Not sure about the widening on the literal types you're bringing up. Maybe @smarter could advise on the preferred approach.

@landerlo landerlo force-pushed the fix-#4867 branch 3 times, most recently from 2708769 to a4b7f1e Compare January 16, 2020 20:56
@smarter
Copy link
Member

smarter commented Jan 19, 2020

Thanks a lot for your PR, I love to see more people experimenting with type inference! :) I broadly agree with @sjrd and @LPTK that what we ultimately want is a more general mechanism which does its best to not widen a user-written type, even if that type ends up in some other place through type inference. (We currently only widen unions and singletons, but if we can make widening better then maybe we could actually go further and also widen the children of a sealed trait or enum which would be really nice I think).

Looking at the cases where inference is improved by your PR, it seems they can be minimized to something like:

class X
class Y
val x: X | Y = identity(if (true) new X else new Y)

but since the logic uses untyped trees it won't help with more complex result types:

class X
class Y

type Id[T] = T
val x: Id[X | Y] = identity(if (true) new X else new Y) // error

on the other hand, if the result type of a val is a union, this PR will keep all unions inferenced in the rhs of that val even if they're unrelated to the result type, this can have consequences on implicit search for example:

class Base
class X extends Base
class Y extends Base

implicit def invBase: Inv[Base] = new Inv[Base]

def getInv[T](x: T)(implicit inv: Inv[T]): Int = 1

// ok on master and this PR
val a: Int = getInv(if (true) new X else new Y)
// ok on master, "no implicit argument of type Inv[X | Y]" with this PR
val b: Int | Any = getInv(if (true) new X else new Y)

Anyway, I think there must be an underlying bug in master here, because even though we widen singleton types in a similar way, the following currently works fine:

val x: 1 = identity(1)
type Id[T] = T
val x: Id[1] = identity(1)

I'd like to take a closer look at widening and type inference in the near future, but before I can get to that, I need to work on some deep bugs in our constraint solving mechanism, so I can't promise any quick fixes.

@landerlo
Copy link
Author

Thank you for the review and feedback, I agree that more generality is desirable. I'll try to put some more thinking into it. The driving motivations for allowing the cases currently supported in this PR is my experimentation with some alternative encodings with union types, examples are in https://github.com/landerlo/fscala19-algapp/ if anybody interested.

liufengyun and others added 11 commits February 5, 2020 11:55
An enum value may have the type `A & B`, in such cases we
need to register for both `A` and `B`.
…mented

The check for a concrete class used to be simply that its `abstractTermMembers`
are empty. However, i7597.scala shows that this is not enough. The problem is
that abstractTermMembers is based on signatures. It will not include a member
as long as there is a concrete member with the same signature.

But as scala#7597 shows it's possible for a concrete member to have the same signature
as an abstract one but still not have a matching type.
Check for duplicate symbols when  creating export forwarders.
Fix scala#8355: REPL tests : fix for two tests failing on Windows
Fix scala#7597: Refine checks whether a deferred term member is implemented
Fix scala#8333: Check for duplicate symbols in exports
GH actions don't digest them well fttb
robstoll and others added 25 commits February 26, 2020 06:50
When a community test fails, re-run it 2 more times to avoid the whole suite failure.
Added a scripted sbt test to check if a @main annotation is detected by sbt
…ined because of a non-literal string parameter
We used to pretty-print trees in the API info we send to sbt, but the
pretty-printed output seems to be unstable leading to overcompilation,
so just use the raw trees instead (this should also be faster).
Avoid overcompilation involving inline or annotation trees
…polator

Fix scala#8362: Fail compilation if a compile time error can't be inlined because of a non-literal string parameter
doc(macros): fix expansion and some improvements
doc(trait param): state which rule is violated
doc(untupling): fix list at the end, add newline
doc(context-functions): typo `are` instead of `is`
doc(operators): typo and add import to examples
Improves the inference of union types by preserving the
union when there is a type ascription with an union.

If the term has not been explicity ascribed with an union then
the existing semantics of joining the orType is maintained.

To determine whether an explicit union ascription exists, a lexical
check on the untyped tree is performed.
Splits union.scala test from negative to negative and positive tests,
exhibiting the preservation of the union when the union has been
explicitly ascribed as per fix scala#4867.
@OlivierBlanvillain
Copy link
Contributor

This PR has been inactive for 2 months so I'll close it for now. @landerlo feel free to reopen in case of further developement!

@smarter
Copy link
Member

smarter commented Apr 3, 2020

Good news: all the examples in this PR should work in the latest nightly thanks to #8635 !

@simlei
Copy link

simlei commented May 14, 2022

Is this issue still on your radar, although it's closed?

@smarter
Copy link
Member

smarter commented May 14, 2022

Which issue? As I noted in my previous comment the examples in this PR should now work. But there are other issues open related to union widening in this repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.