[SPARK-39551][SQL] Add AQE invalid plan check #36953
case BuildLeft => (b.left, b.right)
case BuildRight => (b.right, b.left)
}
if (!buildPlan.isInstanceOf[BroadcastQueryStageExec]) {
If the `BroadcastHashJoinExec` was converted from a sort merge join, shouldn't its direct child be a `BroadcastExchangeExec`? For example, with two joins, the first is a `BroadcastHashJoinExec` and the second starts as a sort merge join, then changes to a broadcast join after re-optimization.
Oh, got it. It checks that `BroadcastQueryStageExec` can only appear as the direct child of a broadcast join; it does not check that the direct child must be a `BroadcastQueryStageExec`.
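The rule discussed in this thread can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `Scan`, `BroadcastStage`, `BroadcastJoin`, `SortMergeJoin` and `InvalidPlanException` are hypothetical stand-ins defined here, not Spark's classes, and the traversal mirrors only the fragment quoted above.

```scala
// Minimal sketch with hypothetical node types; none of these are Spark classes.
object AqePlanCheckSketch {
  sealed trait Plan { def children: Seq[Plan] }
  case class Scan(name: String) extends Plan { val children: Seq[Plan] = Nil }
  // Stand-in for a broadcast query stage; modelled as a leaf for simplicity.
  case class BroadcastStage(id: Int) extends Plan { val children: Seq[Plan] = Nil }
  case class BroadcastJoin(left: Plan, right: Plan, buildLeft: Boolean) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
  }
  case class SortMergeJoin(left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
  }

  final class InvalidPlanException(msg: String) extends RuntimeException(msg)

  def validate(plan: Plan): Unit = plan match {
    case j: BroadcastJoin =>
      val (buildPlan, probePlan) =
        if (j.buildLeft) (j.left, j.right) else (j.right, j.left)
      // A broadcast stage directly on the build side is allowed, so skip it.
      // Any other build-side subtree is still validated; it is NOT required
      // to be a broadcast stage.
      if (!buildPlan.isInstanceOf[BroadcastStage]) validate(buildPlan)
      validate(probePlan)
    case b: BroadcastStage =>
      // Reached anywhere other than a broadcast join's build side.
      throw new InvalidPlanException(s"misplaced broadcast stage ${b.id}")
    case other =>
      other.children.foreach(validate)
  }

  def isValid(plan: Plan): Boolean = scala.util.Try(validate(plan)).isSuccess
}
```

In this sketch a `BroadcastJoin` whose build side is a plain `Scan` passes, which matches the clarification above: the check restricts where a broadcast stage may appear, not what the build side must be.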
@ulysses-you You can just add your specific check patterns into
Yeah, this is a framework and we can add more checks to it.
Thanks, merging to master/3.3!
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan. AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation. No. Added UT. Closes #36953 from maryannxue/validate-aqe. Authored-by: Maryann Xue <maryann.xue@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 58b91b1) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
I made a backport to branch-3.2 for the comment, #37087 (comment).
This will be a part of Apache Spark 3.2.2.
### What changes were proposed in this pull request? This is a backport of #36953 This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan. ### Why are the changes needed? AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added UT. Closes #37108 from dongjoon-hyun/SPARK-39551. Authored-by: Maryann Xue <maryann.xue@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request? This is a backport of apache#36953 This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan. ### Why are the changes needed? AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added UT. Closes apache#37108 from dongjoon-hyun/SPARK-39551. Authored-by: Maryann Xue <maryann.xue@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit be891ad) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan. AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation. No. Added UT. Closes apache#36953 from maryannxue/validate-aqe. Authored-by: Maryann Xue <maryann.xue@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 58b91b1) Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 3cf3048)
What changes were proposed in this pull request?
This PR adds a check for invalid plans in the AQE replanning process. The check throws an exception when it detects an invalid plan, causing AQE to discard the current replanning result and keep using the latest valid plan.
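The fallback behavior described above can be sketched as a small helper. This is a hedged illustration, not the code in this PR: `reOptimize`, its parameters, and `InvalidPlanException` are hypothetical names, and plans are left generic so the sketch runs without Spark.

```scala
// Minimal sketch of "validate, else keep the latest valid plan".
// All names here are assumptions made for illustration.
object AqeReplanFallbackSketch {
  final class InvalidPlanException(msg: String) extends RuntimeException(msg)

  // Returns the re-optimized candidate if it passes validation,
  // otherwise returns the last plan known to be valid.
  def reOptimize[P](lastValid: P, candidate: => P, validate: P => Unit): P =
    try {
      val newPlan = candidate
      validate(newPlan) // throws InvalidPlanException on an invalid plan
      newPlan
    } catch {
      case _: InvalidPlanException => lastValid // void this replanning round
    }
}
```

Only the validation exception is caught in this sketch, so unrelated failures during replanning would still surface to the caller.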
Why are the changes needed?
AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added UT.