-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-22301][SQL] Add rule to Optimizer for In with not-nullable value and empty list #19523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| override def children: Seq[Expression] = value +: list | ||
| lazy val inSetConvertible = list.forall(_.isInstanceOf[Literal]) | ||
| lazy val isListEmpty = list.isEmpty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this val?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using it to be consistent with the current implementation (see the line above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call list.isEmpty is, in comparison, fast and constant time. It doesn't save much of anything to cache it, and the overhead of a lazy val
|
|
||
| case In(_: AttributeReference, list: Seq[Expression]) if list.isEmpty => Literal.FalseLiteral | ||
| // We rely on the optimizations in org.apache.spark.sql.catalyst.optimizer.OptimizeIn | ||
| // to be sure that the list cannot be empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this comment is not accurate, since in optimizer we only deal with the case attribute is not nullable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this line after we merge this PR #19522
|
ok to test |
|
@mgaido91 Could you update the PR title? |
|
Test build #82923 has finished for PR 19523 at commit
|
|
retest this please |
|
Test build #82928 has finished for PR 19523 at commit
|
|
Test build #83011 has finished for PR 19523 at commit
|
|
LGTM |
|
Thanks! Merged to master. |
What changes were proposed in this pull request?
For performance reason, we should resolve in operation on an empty list as false in the optimizations phase, ad discussed in #19522.
How was this patch tested?
Added UT
cc @gatorsmile