-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix an issue in InlineProjections that may cause query failure when multiple column masks exist on a table #12262
Conversation
53fe8d8
to
31dfa0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
% comments
core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/InlineProjections.java
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/planner/plan/Assignments.java
Outdated
Show resolved
Hide resolved
28e4d26
to
600b3a7
Compare
When there are multiple column masks on a table, it is possible to have symbols in child projection that don't qualify for inlining. If that's the case, we should keep their original assignment instead of resetting them to identity projection.
22e3ab0
to
4b6d7d1
Compare
Merged thanks. |
This issue was originally caused by the way column masks are planned. While planning column masks, we add projections which change the semantics of the masked symbols:
This seems brittle, because different plan transformations we have in Trino span across sequences of plan nodes, and they assume that a symbol consistently has the same semantics. I would not be surprised if we had silent correctness issues resulting from that. Another issue with planning column masks this way is the notion of "sequential masking". The masks which are applied later "see" the other columns already masked (but maybe it is the way it should be?). Consider this example:
In this example, the mask on The solution to both issues would be to plan masks so that the resulting symbol is a new symbol. |
Thanks @kasiafi for pointing that out. For your second note on "sequential masking", I agree there is a mask dependency issue, which I'm not sure what is the correct/compliant behavior. What's worse is if the sequence of masking is performed in an non-deterministic way, the query result will be inconsistent as well. |
Each mask is like a view. If you have multiple masks then you have recursive view that reads from view that reads from a table. So each mask is applied independently. If you have a table with columns a, b, and two masks functions that Surely the order is important and engine applies them in the order that was returned by the plugin. Not sure how much test coverage we have here. |
assertThat(assertions.query(query)) | ||
.matches("VALUES (CAST('***' as varchar(79)), 'O', CAST('***#000000951' as varchar(15)))"); | ||
|
||
// Mask "comment" and "orderstatus" using "clerk" ("clerk" appears between "orderstatus" and "comment" in table definition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of columns shouldn't determine any mask/filter related behavior.
I don't understand this comment, but it looks like explaining why the expected behavior is what it is.
If the expected behavior was identified as dependent on the column order in the table definition, this comment should be a TODO + link to a github issue.
see also #14420 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments on the order of column definitions do not seem related. The tests don't show any dependency between the column order and the effective semantics of the masks. @weiatwork, am I correct? What's the purpose of those comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @kokosing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kasiafi That's right. That test didn't demonstrate any "column mask dependent" masking scenario, only trying to avoid the previously not so well understood behavior.
Since now I'm more confident that this behavior is problematic, I made adjustment to the test case and intentionally demo that there shouldn't be dependencies after the fix: https://github.com/trinodb/trino/pull/14420/files#diff-637b4407daa07892834452da6da9e755fd1dfb72813298df590c7fe019796249R849
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weiatwork please see my comment #14420 (comment).
If the behavior of the masks does not depend on the order of column definitions, then it would be worthy to remove the related part of the comments, e.g. ("clerk" appears between "orderstatus" and "comment" in table definition)
, as they might mislead someone into thinking that there is a dependency between mask semantics and column def order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is dependency. Now I realize there shouldn't be any dependency among masks
Description
Related issues, pull requests, and links
I incorporated unit test cases from #10437. The change in #10437 can be seen as an improvement to avoid unnecessary Project node stacking, but the root cause is how we handle multiple Project nodes when performing symbol inlining.
Fixes #10370
Documentation
(X) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(X) Release notes entries required with the following suggested text: