List patterns: Skip compiler-generated dag nodes for better codegen #57909

alrz · 2021-11-21T10:50:18Z

To facilitate subsumption checking for list patterns, some nodes may be inserted into the dag during construction, namely BoundDagAssignmentEvaluation and a BoundDagValueTest on the length temp. These nodes drive the general shape of the dag and help to determine unreachable states. After that, they play no role in subsequent phases, especially lowering, where they result in suboptimal codegen, as demonstrated in the issue.

BoundDagAssignmentEvaluation was already skipped during DAG lowering; with this change it won't reach codegen at all.

For BoundDagValueTest we need to do additional analysis, because when the condition actually turns out to be significant, we still want to keep it in the final DAG. Otherwise we short-circuit to the "false" branch, which is the default, as if we had never inserted such a test (note that when the true branch is important, we have unset the flag earlier).
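The short-circuiting described above can be pictured with a toy model. This is only a sketch under stated assumptions: `Node`, `Leaf`, `Test`, `Synthesized`, `Significant`, and `DagSketch` are hypothetical names invented for illustration, not Roslyn's `BoundDecisionDag` API, and the model is a tree rather than a true dag with shared nodes.

```csharp
using System;

// Toy model only: these types are hypothetical, not Roslyn's bound nodes.
abstract record Node;
sealed record Leaf(string Label) : Node;
sealed record Test(string Name, bool Synthesized, bool Significant,
                   Node WhenTrue, Node WhenFalse) : Node;

static class DagSketch
{
    // Replaces every synthesized test that is not significant with its
    // (recursively rewritten) "false" successor, as if it was never inserted.
    public static Node Skip(Node node) => node switch
    {
        Test { Synthesized: true, Significant: false } t => Skip(t.WhenFalse),
        Test t => t with { WhenTrue = Skip(t.WhenTrue), WhenFalse = Skip(t.WhenFalse) },
        _ => node,
    };

    // Counts the test nodes in the tree.
    public static int CountTests(Node node) => node switch
    {
        Test t => 1 + CountTests(t.WhenTrue) + CountTests(t.WhenFalse),
        _ => 0,
    };

    public static void Main()
    {
        Node dag = new Test("synthesized length test", true, false,
            new Leaf("unreachable"),
            new Test("x[0] == 1", false, false, new Leaf("match"), new Leaf("no match")));

        int before = CountTests(dag);
        int after = CountTests(Skip(dag));
        Console.WriteLine($"{before} -> {after}"); // prints "2 -> 1"
    }
}
```

A synthesized test marked `Significant` is kept by the second switch arm, which mirrors the case where the flag has been unset because the true branch matters.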

Decompiled examples from the issue

  private static int Test1(int[] x)
  {
    if (x != null)
    {
      int length = x.Length;
      if (length >= 1 && x[length - 1] == 1 && x[0] == 1)
      {
        return 0;
      }
    }
    return 1;
  }

  private static int Test2(int[] x)
  {
    if (x != null)
    {
      int length = x.Length;
      if (length >= 1 && x[0] == 2 && x[length - 1] == 1)
      {
        return 0;
      }
    }
    return 3;
  }

  private static int Test3(int[] x)
  {
    if (x != null)
    {
      int length = x.Length;
      if (length >= 1)
      {
        if (x[0] == 2)
        {
          return 4;
        }
        if (x[length - 1] == 1)
        {
          return 5;
        }
      }
    }
    return 3;
  }

  private static int Test4(int[] x)
  {
    if (x != null)
    {
      int length = x.Length;
      if (length >= 1)
      {
        int num = x[0];
        if (num == 2)
        {
          return 4;
        }
        int num2 = x[length - 1];
        if (num2 == 1)
        {
          return 5;
        }
        if (num == 6 && num2 == 7)
        {
          return 8;
        }
      }
    }
    return 3;
  }

Relates to test plan #51289

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs

AlekseyTs · 2021-11-22T16:42:58Z

@alrz Consider adding more details about the change to the description. What nodes are skipped, why it is the right thing to do, etc.

In reply to: 975716098

alrz · 2021-11-23T11:58:52Z

Updated. Let me know if it doesn't clearly reflect the reasoning behind the change.

In reply to: 976443029

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter.DecisionDagRewriter.cs

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_IsPatternOperator.cs

jcouv

Done with review pass (iteration 2). Only minor comments

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

jcouv · 2021-11-24T20:07:06Z

FYI, I retargeted this PR to main branch.

In reply to: 978185113

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs

AlekseyTs · 2021-11-29T18:44:36Z

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

+                            {
+                                // We're going to remove compiler-generated nodes from dag right after construction.
+                                // If we have found an explicit value test we need to unset the flag to preserve it.
+                                state.SelectedTest = new BoundDagValueTest(v.Syntax, v.Value, v.Input, v.HasErrors);


I am assuming we have tests that confirm the significance of this operation. Could you provide an example of an affected scenario and what the decompiled code looks like for it?

The compiler-generated length test (L=1) is merged with the length test from the second pattern.

switch (a) { case [.., 1]: case [2]: return 0; }

  if (a != null)
  {
    int length = a.Length;
    if (length >= 1 && (a[length - 1] == 1 || (length == 1 && a[0] == 2)))
    {
      return 0;
    }
  }

Not doing so would eliminate the whenTrue branch and cause a false subsumption error.
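The source switch and the decompiled form quoted above can be compared on a few inputs. The harness below is a minimal sketch, assuming a compiler with C# 11 list patterns; the names `ListPatternCheck`, `Source`, and `Lowered` are hypothetical, and the fall-through return value of 1 is an assumption, since the original snippets do not show what happens when no case matches.

```csharp
using System;

static class ListPatternCheck
{
    // Source form from the comment above. The fall-through value 1 is assumed.
    public static int Source(int[] a)
    {
        switch (a) { case [.., 1]: case [2]: return 0; }
        return 1;
    }

    // Decompiled form from the comment above, with the same assumed fall-through.
    public static int Lowered(int[] a)
    {
        if (a != null)
        {
            int length = a.Length;
            if (length >= 1 && (a[length - 1] == 1 || (length == 1 && a[0] == 2)))
            {
                return 0;
            }
        }
        return 1;
    }

    public static void Main()
    {
        int[][] inputs =
        {
            null, new int[0], new[] { 1 }, new[] { 2 }, new[] { 3 },
            new[] { 2, 1 }, new[] { 1, 2 }, new[] { 2, 3 },
        };

        // Both forms must agree on every input.
        foreach (int[] a in inputs)
        {
            if (Source(a) != Lowered(a))
            {
                throw new InvalidOperationException("source and lowered forms disagree");
            }
        }
        Console.WriteLine("all inputs agree");
    }
}
```

The `length == 1` check in the lowered form is the merged length test: it is reached only after `length >= 1` has succeeded, so it cannot be dropped without changing the result for inputs such as `{ 2, 3 }`.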

Not doing so would eliminate the whenTrue branch and cause a false subsumption error.

I thought the WasCompilerGenerated flag didn't have any impact on subsumption checking. Is this incorrect?

Can you think of a scenario when leaving this node won't be useful at the end? For example, when length is explicitly checked. Something like:

switch (a) { case [.., 1]: case [2, ..] and {Length: <3 or > 5}: return 0; }

It doesn't, but since we're removing nodes based on it, it should not be set where it's not supposed to be.
I'm gonna go ahead and make the change to not depend on WasCompilerGenerated. I think that'll make all this clearer.

I don't think we can avoid duplicated nodes. Ignoring unused evaluations, what other improvements do you think could be made to this code?

It is quite possible that the scenario doesn't lead to the duplication because one of the branches is optimized away. Basically we determined that either TrueBranch or FalseBranch is definitely false. However, I don't see anything in the implementation that would guarantee that we never end up with two branches alive.

To clarify, I am not looking for ways to improve the code. I am looking for a definitive proof that the code is "better" than the one we would produce without the list pattern subsumption checks in place. This means that we have to compare two code gen strategies and see that the one with artifacts is definitely better. Perhaps even to the point that we would want to implement it if the subsumption checking didn't give it to us for free. If there is no proof like that, then I think that it is better to generate code from a Dag that doesn't have any artifacts, instead of marking things, hoping they can be removed safely, and hoping for the best if they can't be removed.

we might as well construct a new dag dedicated to lowering.

This is definitely an option if we cannot find a way to achieve the same in some "incremental" fashion.

I am looking for a definitive proof that the code is "better" than the one we would produce without the list pattern subsumption checks in place.

As you mentioned, the current dag can assume the result of tests in certain branches depending on the length value. One possible definition of "better code" would be "fewer tests before we jump to a leaf node", and by this definition we're emitting better code.

I don't see anything in the implementation that would guarantee that we never end up with two branches alive.

I think the fact that we can't optimally handle such branches is a general issue with the existing pattern-matching machinery. We use a simple left-to-right heuristic which may not result in optimal code with certain trees. See #29033.

One possible definition of "better code" would be "fewer tests before we jump to a leaf node", and by this definition we're emitting better code.

This is a theory or a hope. This must be proven.

I think the fact that we can't optimally handle such branches is a general issue with the existing pattern-matching machinery. We use a simple left-to-right heuristic which may not result in optimal code with certain trees. See #29033.

Again, I am not going for the most optimal code gen. I simply would like to see a confirmation that there is a good reason to keep any artifacts of the list pattern subsumption checks in the generated code. We are complicating the implementation and making an assumption that we are able to accurately detect "unremovable" artifacts. We are also complicating the future maintenance of the machinery: we have to make sure that future changes do not invalidate the assumption, etc. All this comes with additional risk and cost. Why do we want to take it? Is there a good reason for that?

Simply removing all the artifacts won't work as they're practically part of the input now and the resulting dag depends on them to occur at some point during execution. We can, however, short-circuit some of them that aren't crucial to the result.

I think we have two options moving forward:

1. Move the logic to detect unavoidable artifacts to a later step maybe based on dag nodes themselves instead of looking for a "matching source test" which I take it is vaguely defined.

2. Re-compute the dag for lowering. Maybe we can reuse some of the nodes but to my understanding, we can't do it as a "modification" to the existing graph. The two could be very different in shape to be able to "translate back" to the original.

What is your recommendation on approaching this?

What is your recommendation on approaching this?

Thank you, @alrz. We will discuss this internally and will get back to you early next week.

AlekseyTs · 2021-11-29T18:45:16Z

Done with review pass (commit 6)

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

AlekseyTs · 2021-12-13T21:52:00Z

@alrz After an offline discussion within the team, we decided that the solution we are most comfortable with at the moment is to build a dedicated Dag for the purpose of the code gen if the Dag built for the purpose of subsumption checking contains nodes synthesized solely for the purpose of list pattern subsumption checking. That means that users utilizing pattern matching, but not utilizing list patterns specifically, won't pay any noticeable penalty due to the list patterns feature.

alrz · 2021-12-17T07:24:37Z

I'll hold off on this until the open issue about assumptions around slice result is resolved. At the moment it's not clear to me whether we should only avoid synthesizing tests for lowering or bail out entirely for any non-identical indexers.

jcouv · 2021-12-18T01:43:01Z

the open issue about assumptions around slice result is resolved.

The topic is scheduled for LDM right after New Year. Based on discussion so far, the leaning is towards having the diagnostics behave as-if Slice could never return null, but have the codegen include the null check.
A design with two DAGs (one for subsumption/exhaustiveness diagnostics, and one for codegen) makes that behavior possible. But the Slice question may imply that we need to keep both DAGs around longer, so that nullability analysis can run on the former DAG (not the codegen DAG).

alrz · 2021-12-21T20:30:48Z

(not the codegen DAG)

If you mean we should run the nullability analysis on the dag that "contains" the null test on the slice result (to preserve the current behavior), that'll be the codegen dag actually, however, there's a problem:

For the first pass, we skip the null check on the slice and relate alternative indexers.
For codegen, we keep the null check on the slice but avoid synthesizing any tests.

Neither of these graphs would be equivalent to the current one, so nullability will be affected (albeit not that of the slice result if we pick the latter, but rather that of any nested list subpatterns that we have determined to be related in the former dag).

alrz · 2022-01-22T13:13:16Z

@AlekseyTs @jcouv

the solution we are most comfortable with at the moment is to build a dedicated Dag for the purpose of the code gen if the Dag built for the purpose of subsumption checking contains nodes synthesized solely for the purpose of list pattern subsumption checking.

While this is unlikely to change the program output, I'm wondering if this could cause a change in the order of evaluation, particularly around when clauses. I have yet to come up with an example where this is apparent, but on the surface, imagine that due to the implied result of the next case, we jump directly to the when clause. That would not be the case if we did not consider alternative indexers to be related: we would evaluate the pattern first, and only then evaluate the when clause. This could get worse when multiple when clauses are involved.

alrz · 2022-01-22T18:24:37Z

Closing in favor of #59019

Skip compiler-generated dag nodes

1704778

alrz requested a review from a team as a code owner November 21, 2021 10:50

ghost added the Community label (the pull request was submitted by a contributor who is not a Microsoft employee) Nov 21, 2021

dotnet-issue-labeler bot added the Area-Compilers label Nov 21, 2021

alrz requested review from jcouv and AlekseyTs November 21, 2021 10:50

jcouv self-assigned this Nov 21, 2021

Extract method

04cd111

alrz commented Nov 21, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

runfoapp bot mentioned this pull request Nov 22, 2021

Test AsyncIteratorWithAwaitAndYieldAndAwait is flaky #57797 (closed)

jcouv reviewed Nov 23, 2021

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs (outdated)

jcouv reviewed Nov 23, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

jcouv reviewed Nov 23, 2021

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter.DecisionDagRewriter.cs

jcouv reviewed Nov 23, 2021

src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_IsPatternOperator.cs

jcouv reviewed Nov 23, 2021

PR Feedback

cda1fed

jcouv reviewed Nov 24, 2021

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs (outdated)

jcouv changed the base branch from features/list-patterns to main November 24, 2021 20:06

runfoapp bot mentioned this pull request Nov 24, 2021

Failure in CSharpCodeActions.GFUFuzzyMatchAfterRenameTrackingAndAfterGenerateType #57423 (closed)

alrz added 3 commits November 25, 2021 21:42

Merge remote-tracking branch 'origin/main' into list-patterns-simp

4a02053

Fixup

2231d67

Typo

b008d87

runfoapp bot mentioned this pull request Nov 25, 2021

Flaky test: Roslyn.VisualStudio.IntegrationTests.CSharp.CSharpCodeActions.FastDoubleInvoke #57551 (closed)

AlekseyTs reviewed Nov 29, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

AlekseyTs reviewed Nov 29, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

AlekseyTs reviewed Nov 29, 2021

Do not depend on WasCompilerGenerated flag

5794267

jcouv added the Feature - List Patterns label Dec 2, 2021

This was referenced Dec 2, 2021

[Flaky Test] Roslyn.VisualStudio.IntegrationTests.CSharp.CSharpCodeActions.GenerateMethodInClosedFile times out #57722 (closed)

Microsoft.CodeAnalysis.CSharp.CommandLine.UnitTests.CommandLineTests.ArgumentParsing is flaky #58077 (open)

alrz commented Dec 2, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

Simplify

669917a

alrz commented Dec 3, 2021

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs

AlekseyTs reviewed Dec 3, 2021

src/Compilers/CSharp/Portable/BoundTree/BoundDecisionDag.cs (outdated)

alrz commented Dec 6, 2021

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs (outdated)

alrz mentioned this pull request Dec 6, 2021

BoundDecisionDag.Rewrite - Avoid capturing the replacement map #58137 (merged)

AlekseyTs reviewed Dec 7, 2021

src/Compilers/CSharp/Portable/Binder/DecisionDagBuilder.cs (outdated)

alrz marked this pull request as draft December 7, 2021 19:50

Merge remote-tracking branch 'origin/main' into list-patterns-simp

6f5536a

alrz force-pushed the list-patterns-simp branch from aab11f4 to 6f5536a December 8, 2021 23:59

alrz mentioned this pull request Jan 22, 2022

List patterns: Recreate the decision dag for lowering #59019 (merged)

alrz closed this Jan 22, 2022
