-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List patterns: subsumption checking #53891
Conversation
Based on the last LDM, we want to cover nested list patterns in slices - ignoring |
@333fred @jcouv for clarification here. My understanding was that we would think of the above two as the same for purposes of subsumption checking. however, i don't think we stated or made a conclusion on if we'd expect the same codegen. For example, i could imagine different codegen for That said, perhaps havin ghte same codegen is the logic outcome of us effectively saying that we presume that if you implement hte list shape that you are 'well behaved' and we can make these sorts of assumptions. Just wanted to ensure that this was what both of you intended. Thanks! |
I guess that was indeed the intent. But when we're going to assume that the subslice can subsume outer subpatterns, that means we expect values to be the same in which case it would be correct to flatten the list (otherwise the subsumption error is incorrect in the first place). We're treating properties in this manner: again, we assume the implementation is idempotent, that applies to both subsumption checking as well as codegen. The two are actually very much in sync as to what counts as subsumption and what the codegen looks like. |
@alrz I agree that the "well-behaved" assumption would allow us to do some codegen optimizations. What scenarios might we want to optimize?
In reply to: 856929120 In reply to: 856929120 |
This comment has been minimized.
This comment has been minimized.
src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder_ListPatterns.cs
Outdated
Show resolved
Hide resolved
4704ce9
to
6eb6db8
Compare
src/Compilers/CSharp/Test/Semantic/Semantics/PatternMatchingTests_ListPatterns.cs
Show resolved
Hide resolved
Nice. https://github1s.com/dotnet/roslyn/pull/53891 If it only could add review comments.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks (iteration 67) with two nits to double-check with Fred: confirm semantics for new parameters of CheckConsistentDecision
and naming of the "assignment" bound node.
FWIW, I just noticed that we have some tests that directly verify the DAG. Just wanted to mention it as this could be useful for this feature (it's easier to verify that IL). For example, |
1e93434
to
18fcbdb
Compare
Sorry, I can't promise I'll have time to fully review this. It looks like there's a lot of additional code in the branch already that's pretty important to the behavior, but looking through the existing branch code will take a lot more time. |
: !decisionPermitsTrueOther | ||
? AndSequence.Create(Not.Create(AndSequence.Create(precondition, sideeffect)), other) | ||
: AndSequence.Create(OrSequence.Create(Not.Create(precondition), sideeffect), other); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a design review with the dev team on this feature. The goal was mostly education (refresh the team based on what Fred and I now understand about DAG design and explain the changes from this PR) and also allow feedback.
@AlekseyTs raised an interesting idea which I'd like to bounce with you. This is not meant to block this PR, but in case it could spark your imagination with a potentially simpler design.
Instead of injecting preconditions and sideeffects, what if this Filter
method had the following structure:
// step 1: if the length is known to have a single value (**), then rewrite the current test in light of that knowledge
// step 2: existing logic CheckConsistentDecision
followed by simple ternaries (as it existed prior to this PR)
In step 1: if in the current state we know that Length==1 (by looking at remaining values), and the current test/evaluation is regarding an indexer access, then we replace the current test. For example t3 == 42
would be rewritten as t2 == 42
. Basically, from this point on in the DAG we'd be dealing with normalized temps (no longer dealing with t3
, and dealing with t2
instead).
We'd need to think more about how to deal with remaining values in that design. Conceptually a similar approach might work (if the length is known to have a specific value in the current state, then remaining values of certain temps can be merged).
(**) in writing this summary, I realized that identifying states where length gets narrowed to a single value isn't easy. If the selected test is Length == k
then this is easy. But the selected test could be Length >= k
which could narrow the set of remaining values to only k
. So we'd have to compute remaining values first to see if Length
was narrowed to a single value...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrite the current test in light of that knowledge
we'd have to compute remaining values first
Since we've been through many MANY flaws in the current design already (thanks for that btw!) I think we can easily poke holes in alternative approaches. IIRC we did consider both of these (which are interdependent) but I was convinced it's going to get too subtle. There's just many questions to answer before we even get close to a rough sketch. Here's some examples:
-
What happens if
RemainingTests
is not linear? A precomputed length value may not be always the same for all indexer evaluations e.g. when we haveor
combinators which will cause the dag to split. That would require a lot of plumbing to get right (remember those could come from an explicit Length test in a property pattern so there's infinite possibilities). -
Rewriting tests would have the same problem, plus we may still need the original input for subsequent tests so we should keep track of both somewhere. When you say "from this point on in the dag" that means we need to adjust DagState which has its own way of handling we didn't need to touch in this PR.
-
At a given state, what temps need to be merged exactly? It's possible that the rewrite depends on multiple length values and we can't depend on a direct value test for the reasons mentioned above.
The current approach is formulated inside the decision dag itself, rather than a pre-processing step + maintaining a bunch of additional states to cover all corner cases. So I believe this has made as simple as possible, if we go any further there's a high chance that we could miss something. (edit: I'd be absolutely curious to see this implemented differently, it just took me three months to wrap my head around this one!)
6956051
to
751a18b
Compare
751a18b
to
b1d3a08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks (iteration 71)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 71). We can look at possible simplifications as a followup.
Not sure why |
Seems like a one off problem. It's working fine in other PRs. We can merge over that if it's the only problem. |
🎉🎉🎉 |
Thanks @alrz for pushing through. This was an especially interesting and challenging change. I think the result is pretty fantastic, as it achieves the maximum exhaustiveness/subsumption enforcement that we were hoping for 🎉 |
{ | ||
return this == obj || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just moved to the base; Equals
does its own and then calls into the overrides.
For direct IsEquivalent
usages it's not important because it's rather unlikely to have identical nodes where we use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just moved to the base; Equals does its own and then calls into the overrides.
Because of this change we no longer bypass the following checks when the same instance is compared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public bool Equals(BoundDagEvaluation other)
{
return this == other ||
this.IsEquivalentTo(other) &&
this.Input.Equals(other.Input);
}
I think this implementation does just that if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this implementation does just that if I'm not mistaken.
I am not sure what implementation you are looking at, I am commenting on this implementation:
public override bool IsEquivalentTo(BoundDagEvaluation obj)
{
return base.IsEquivalentTo(obj) &&
// base.IsEquivalentTo checks the kind field, so the following cast is safe
this.Index == ((BoundDagIndexEvaluation)obj).Index;
}
I think, the checks after base.IsEquivalentTo(obj)
are performed for the same instance. Whereas before the change we would bypass all checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking about BoundDagEvaluation.Equals
which is an existing code path with reference check intact.
A I said earlier, for direct IsEquivalentTo
usages (where we indeed don't have a reference check) most of the time we've already determined that the two are not the same instance, e.g. Equals
just returned false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just revert the check if you feel it's going to make a visible difference. I have no objection on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just revert the check if you feel it's going to make a visible difference. I have no objection on that.
I prefer that we don't remove optimazation just because we assume that APIs are going to be called in certain order. That doesn't feel like the right approoach in the long term.
// Ignore input source as it will be matched in the subsequent iterations. | ||
if (s1Input.IsEquivalentTo(s2Input) && | ||
// We don't want to pair two indices within the same pattern. | ||
s1LengthTemp.Syntax != s2LengthTemp.Syntax) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Syntax in bound nodes is just a pointer to related location in source, it is not meant to convey any semantical information. In general, we shouldn't even make assumptions what syntax node a bound node points to. Binding should keep working the same way even when all bound nodes point to the same syntax node, from a dummy tree for example.
Test plan: #51289