Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List patterns: Skip compiler-generated dag nodes for better codegen #57909
List patterns: Skip compiler-generated dag nodes for better codegen #57909
Changes from 6 commits
1704778
04cd111
cda1fed
4a02053
2231d67
b008d87
5794267
669917a
6f5536a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming we have tests that confirm significance of this operation. Could you provide an example of an affected scenario and how the decompiled code looks like for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler-generated length test (L=1) is merged with the length test from the second pattern.
Not doing so would eliminate the whenTrue branch and cause a false subsumption error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought
WasCompilerGenerated
flag didn't have any impact on subsumption checking. Is this incorrect?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you think of a scenario when leaving this node won't be useful at the end? For example, when
length
is explicitly checked. Something like:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't, but since we're removing nodes based on it, it should not be set where it's not supposed to.
I'm gonna go ahead and make the change to not depend on WasCompilerGenerated. I think that'll make all this more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is quite possible that the scenario doesn't lead to the duplication because one of the branches is optimized away. Basically we determined that either TrueBranch or FalseBranch is definitely false. However, I don't see anything in the implementation that would guarantee that we never end up with two branches alive.
To clarify, I am not looking for ways to improve the code. I am looking for a definitive proof that the code is "better" than the one we would produce without the list pattern subsumption checks in place. This means that we have to compare two code gen strategies and see that the one with artifacts is definitely better. Perhaps even to the point that we would want to implement it if the subsumption checking wouldn't give it to us for free. If there is no proof like that, then I think that it is better to generate code from a Dug that doesn't have any artifact. Instead of marking things and then hoping they can be removed safely and hoping for the best if they couldn't be removed.
This is definitely an option if we cannot find a way to achieve the same in some "incremental" fashion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned, the current dag can assume the result of tests in certain branches depending on the length value. One possible definition of "better code" would be "fewer number of tests before we jump to a leaf node" and by this definition we're emitting a better code.
I think the fact that we can't optimally handle such branches is a general issue with the existing pattern-matching machinery. We use a simple left-to-right heuristic which may not result in optimal code with certain trees. See #29033.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a theory or a hope. This must be proven.
Again, I am not going for the most optimal code gen. I simply would like to see a confirmation that there is a good reason to keep any artifacts of the list pattern subsumption checks in the generated code. We are complicating implementation, making an assumption that we are able to accurately detect "unremovable" artifacts. We are also complicating the future maintenance of the machinery, we have to make sure that future changes do not invalidate the assumption, etc. All this comes with an additional risk and cost. Why do we want to take it? Is there a good reason for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply removing all the artifacts won't work as they're practically part of the input now and the resulting dag depends on them to occur at some point during execution. We can, however, short-circuit some of them that aren't crucial to the result.
I think we have two options moving forward:
What is your recommendation on approaching this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @alrz. We will discuss this internally and will get back to you early next week.