[SPARK-13789] Infer additional constraints from attribute equality #11618
cc @nongli
Test build #52779 has finished for PR 11618 at commit
1149d20 to 64e3320
Test build #52784 has finished for PR 11618 at commit
    case a: Attribute if a.semanticEquals(r) => l
  }))
case _ =>
  Set.empty[Expression]
maybe not foldLeft -- most of the time fold is used it can be rewritten into something imperative that is simpler (and a lot faster)
Thank you, fixed!
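To make the review suggestion concrete, here is a minimal sketch of the same accumulation written once with `foldLeft` and once with a `var` plus `foreach`. The `derive` helper and the string inputs are hypothetical placeholders for the per-constraint work; none of this is code from the PR, and it makes no claim about relative speed.

```scala
object FoldVsImperative {
  // Hypothetical stand-in for the per-element work (not from the PR).
  def derive(s: String): Set[String] = Set(s.toUpperCase, s.reverse)

  // Functional style: thread the accumulator through foldLeft.
  def withFold(inputs: Set[String]): Set[String] =
    inputs.foldLeft(Set.empty[String]) { (acc, s) => acc ++ derive(s) }

  // Imperative style: mutate a local var inside foreach.
  def withLoop(inputs: Set[String]): Set[String] = {
    var acc = Set.empty[String]
    inputs.foreach { s => acc ++= derive(s) }
    acc
  }
}
```

Both functions return the same set for the same input; the second form mirrors the `var inferredConstraints` style that the later diff fragment adopts.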
Test build #52819 has finished for PR 11618 at commit
var inferredConstraints = Set.empty[Expression]
constraints.foreach {
  case eq @ EqualTo(l: Attribute, r: Attribute) =>
    inferredConstraints ++= (constraints - eq).map(_ transform {
Why `semanticEquals` instead of `if a == l`?
Just to infer equality based on the expression ids of their attribute references: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala#L44-L49. Is that not necessary in this case?
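As a rough illustration of the distinction raised in this exchange, the sketch below uses a toy attribute type, not Spark's `AttributeReference`. It assumes, as the linked canonicalization code suggests, that a semantic comparison looks only at the expression id, while plain `==` also compares cosmetic fields such as the name. `ToyAttribute` and its fields are hypothetical.

```scala
// Toy stand-in for an attribute reference; NOT Spark's AttributeReference.
final case class ToyAttribute(name: String, exprId: Long) {
  // Assumed "semantic" comparison: only the expression id matters.
  def semanticEquals(other: ToyAttribute): Boolean = exprId == other.exprId
}

object SemanticEqualsDemo {
  val a      = ToyAttribute("a", exprId = 1L)
  val aUpper = ToyAttribute("A", exprId = 1L) // same id, cosmetically different name
  val b      = ToyAttribute("b", exprId = 2L)
}
```

In this toy model `a == aUpper` is false because case-class equality compares the name, whereas `a.semanticEquals(aUpper)` is true because both carry the same id.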

lgtm

Thanks - merging in master.

Great! Based on this PR, it sounds like I can reopen my PR: #10490 Thanks!

@gatorsmile instead of having a special rule for join, we can probably infer all possible filters based on constraints (... something along the lines of sameeragarwal@ce4c944). This should now subsume the predicate transitivity optimization right?

@sameeragarwal Yeah, it sounds like you have already started working on it. True, I did a similar thing, but we need to add extra handling for predicate push down. How about you deliver your current changes first, and I do the remaining part? Thanks!

Shouldn't the existing rule for

I hit a couple of issues when I tried to do it.
Maybe more. I am still trying to write more test cases. Hopefully, I can find all of them. Thank you! If you do not mind, I will try it this weekend.

Sounds good, thank you. In my branch, I try to address (2) by not adding new conditions if the child node(s) already have the given constraint. For (3), please note that pushing

## What changes were proposed in this pull request?

This PR adds support for inferring an additional set of data constraints based on attribute equality. For example, if an operator has constraints of the form (`a = 5`, `a = b`), we can now automatically infer an additional constraint of the form `b = 5`.

## How was this patch tested?

Tested that new constraints are properly inferred for filters (by adding a new test) and equi-joins (by modifying an existing test).

Author: Sameer Agarwal <sameer@databricks.com>

Closes apache#11618 from sameeragarwal/infer-isequal-constraints.
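
The inference described above can be sketched on a toy expression tree. This is a simplified illustration and not the code merged in the PR: `Expr`, `Attr`, `Lit`, this `EqualTo`, `substitute`, and `inferEqualityConstraints` are hypothetical names defined here, and attributes are compared with plain `==` rather than Catalyst's `semanticEquals`.

```scala
// Toy expression tree; these are NOT Spark's Catalyst classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: Int) extends Expr
final case class EqualTo(left: Expr, right: Expr) extends Expr

object InferConstraints {
  // Replace every occurrence of `from` with `to` inside an expression.
  def substitute(e: Expr, from: Attr, to: Attr): Expr = e match {
    case a: Attr if a == from => to
    case EqualTo(l, r)        => EqualTo(substitute(l, from, to), substitute(r, from, to))
    case other                => other
  }

  // For each attribute equality l = r, rewrite the remaining constraints
  // with l replaced by r and with r replaced by l; return only the new ones.
  def inferEqualityConstraints(constraints: Set[Expr]): Set[Expr] = {
    var inferred = Set.empty[Expr]
    constraints.foreach {
      case eq @ EqualTo(l: Attr, r: Attr) =>
        val others = constraints - eq
        inferred ++= others.map(substitute(_, l, r))
        inferred ++= others.map(substitute(_, r, l))
      case _ => // not an attribute equality, nothing to infer from it
    }
    inferred -- constraints
  }
}
```

With the constraints `a = 5` and `a = b`, calling `InferConstraints.inferEqualityConstraints` on that set yields the single new constraint `b = 5`, matching the example in the description.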