[SPARK-17897] [SQL] [BACKPORT-2.0] Fixed IsNotNull Constraint Inference Rule #16894
What changes were proposed in this pull request?
This PR backports #16067 to Spark 2.0.
The `constraints` of an operator are the expressions that evaluate to `true` for all the rows it produces. That means the expression result can be neither `false` nor `unknown` (NULL). Thus, we can infer `IsNotNull` on all the constraints, whether they are generated by the operator's own predicates or propagated from its children. A constraint can be a complex expression. To make better use of these constraints, we try to push `IsNotNull` down to the lowest-level expressions (i.e., `Attribute`). `IsNotNull` can be pushed through an expression when that expression is null-intolerant. (When the input is NULL, a null-intolerant expression always evaluates to NULL.)
Below is the existing code we have for `IsNotNull` pushdown.
`IsNotNull` itself is not null-intolerant: it converts `null` to `false`. If the expression does not include any `Not`-like expression, the existing code works; otherwise, it could generate a wrong result. This PR fixes the above function by removing `IsNotNull` from the inference. After the fix, when a constraint contains an `IsNotNull` expression, we infer new attribute-specific `IsNotNull` constraints if and only if `IsNotNull` appears at the root.
Without the fix, the following test case will return empty.
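As a standalone illustration (not the PR's code or test case), the inference rule described above can be modeled on a toy expression tree. This is a minimal sketch, not Spark's Catalyst implementation: the `Expr` types, `scanNullIntolerant`, and `inferIsNotNull` are hypothetical names, and treating `Add`, `GreaterThan`, and `Not` as null-intolerant is an assumption of the model.

```scala
object ConstraintSketch {
  // Toy expression tree (hypothetical; not Catalyst's Expression classes).
  sealed trait Expr
  final case class Attr(name: String) extends Expr
  final case class Lit(value: Int) extends Expr
  // Assumed null-intolerant: a NULL input always yields a NULL result.
  final case class Add(left: Expr, right: Expr) extends Expr
  final case class GreaterThan(left: Expr, right: Expr) extends Expr
  final case class Not(child: Expr) extends Expr
  // Not null-intolerant: a NULL input yields false, never NULL.
  final case class IsNotNull(child: Expr) extends Expr

  // Collect the attributes reachable from `e` through null-intolerant
  // expressions only. The walk stops at IsNotNull, because a NULL
  // attribute below it does not make the surrounding expression NULL.
  def scanNullIntolerant(e: Expr): Set[String] = e match {
    case Attr(name)        => Set(name)
    case Lit(_)            => Set.empty
    case Add(l, r)         => scanNullIntolerant(l) ++ scanNullIntolerant(r)
    case GreaterThan(l, r) => scanNullIntolerant(l) ++ scanNullIntolerant(r)
    case Not(c)            => scanNullIntolerant(c)
    case IsNotNull(_)      => Set.empty
  }

  // Names of the attributes for which IsNotNull can be inferred from a
  // constraint, i.e. an expression known to be true for every output row.
  def inferIsNotNull(constraint: Expr): Set[String] = constraint match {
    // IsNotNull at the root: its child is known to be non-NULL.
    case IsNotNull(child) => scanNullIntolerant(child)
    // Any other constraint is true, hence itself non-NULL.
    case other            => scanNullIntolerant(other)
  }
}
```

Under this model, `inferIsNotNull(GreaterThan(Add(Attr("a"), Attr("b")), Lit(0)))` yields `Set("a", "b")`, while `inferIsNotNull(Not(IsNotNull(Attr("a"))))` yields an empty set. The second filter keeps exactly the rows where `a` is NULL, so also inferring that `a` is not null would contradict it and leave no rows.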
Before the fix, the optimized plan is like
After the fix, the optimized plan is like
How was this patch tested?
Added a test