[SPARK-13484] [SQL] Prevent illegal NULL propagation when filtering outer-join results #13290
Test build #59259 has finished for PR 13290 at commit
Test build #59261 has finished for PR 13290 at commit
val childrenOutput = q.children.flatMap(c => c.output).groupBy(_.exprId).flatMap {
  case (exprId, attributes) =>
    // If there are multiple Attributes having the same ExprId, we need to resolve
    // the conflict of nullable field.
Is this a possible state: children whose output attributes share the same exprId but differ in nullability?
I don't think it is very likely. Let me think about it more.
Test build #59334 has finished for PR 13290 at commit
test this please
Test build #59658 has finished for PR 13290 at commit
case p if !p.resolved => p // Skip unresolved nodes.
case p: LogicalPlan if p.resolved =>
  val childrenOutput = p.children.flatMap(c => c.output).groupBy(_.exprId).flatMap {
    case (exprId, attributes) =>
When will this happen?
I think with our current implementation, it will not happen.
Then should we just put an assert/require here?
+1
I am not sure we should add an assert. Even if we hit that case, it is still fine to let it pass here, right?
val childrenOutput = p.children.flatMap(c => c.output).groupBy(_.exprId).flatMap {
  case (exprId, attributes) =>
    // If there are multiple Attributes having the same ExprId, we need to resolve
    // the conflict of nullable field.
Maybe attributes.exists(_.nullable)?
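For readers following this thread, the conflict resolution being discussed could look roughly like the sketch below. It is a minimal, standalone model and not the code in this PR: `Attr` and `nullabilityByExprId` are hypothetical names standing in for Catalyst's attribute class and for the `childrenOutput` computation in the diff.

```scala
// Hypothetical, simplified stand-in for a Catalyst attribute (not Spark's class).
final case class Attr(name: String, exprId: Long, nullable: Boolean)

object NullabilityConflict {
  // Collect the output attributes of all children, group them by exprId,
  // and pick one nullable flag per exprId. If any occurrence is nullable,
  // the merged flag is nullable, which is the conservative choice.
  def nullabilityByExprId(childrenOutput: Seq[Seq[Attr]]): Map[Long, Boolean] =
    childrenOutput.flatten.groupBy(_.exprId).map {
      case (exprId, attributes) => exprId -> attributes.exists(_.nullable)
    }
}
```

Under this choice a conflicting pair never fails; it simply resolves to nullable, which matches the point above that it is fine to let the case pass rather than assert.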
LGTM except for a few minor comments.
@yhuai LGTM. Could you add 'Close #113711' to the commit log if this PR is merged into master?
LGTM
Test build #59780 has finished for PR 13290 at commit
Merging to master and branch-2.0.
…ter-join results

## What changes were proposed in this pull request?

This PR adds a rule at the end of the analyzer that corrects the nullable fields of attributes in a logical plan by using the nullable fields of the corresponding attributes in its children logical plans (these plans generate the input rows).

This is another approach for addressing SPARK-13484 (the first approach is #11371).

Close #113711

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>
Author: Yin Huai <yhuai@databricks.com>

Closes #13290 from yhuai/SPARK-13484.

(cherry picked from commit 5eea332)
Signed-off-by: Cheng Lian <lian@databricks.com>
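To illustrate the idea in the description, here is a rough, self-contained sketch of such a rule on a toy plan model. All names (`Attr`, `Plan`, `Relation`, `LeftOuterJoin`, `Project`, `fixNullability`) are hypothetical and only mimic the shape of Catalyst plans; the actual rule in this PR operates on Spark's `LogicalPlan`.

```scala
// Toy model, assumed for illustration only; not Spark's classes.
final case class Attr(name: String, exprId: Long, nullable: Boolean)

sealed trait Plan { def output: Seq[Attr] }

final case class Relation(output: Seq[Attr]) extends Plan

// A left outer join can emit NULLs for the right side, so its output
// marks every right-side attribute as nullable.
final case class LeftOuterJoin(left: Plan, right: Plan) extends Plan {
  val output: Seq[Attr] = left.output ++ right.output.map(_.copy(nullable = true))
}

// A node that refers to attributes produced by its child.
final case class Project(projectList: Seq[Attr], child: Plan) extends Plan {
  val output: Seq[Attr] = projectList
}

object FixNullabilitySketch {
  // Bottom-up pass: fix the children first, then overwrite the nullable flag
  // of each referenced attribute with the flag the child actually produces.
  def fixNullability(plan: Plan): Plan = plan match {
    case r: Relation => r
    case LeftOuterJoin(l, r) => LeftOuterJoin(fixNullability(l), fixNullability(r))
    case Project(projectList, child) =>
      val fixedChild = fixNullability(child)
      val nullabilities = fixedChild.output.groupBy(_.exprId).map {
        case (exprId, attributes) => exprId -> attributes.exists(_.nullable)
      }
      val fixedList = projectList.map { a =>
        nullabilities.get(a.exprId).fold(a)(n => a.copy(nullable = n))
      }
      Project(fixedList, fixedChild)
  }
}
```

For example, a `Project` that still refers to a right-side column as non-nullable (as it was before the join was introduced) gets that reference corrected to nullable, while left-side references keep their original flag.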