-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-2052] [SQL] Add optimization for CaseConversionExpression's. #990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious - do you have a use case for this rule? I wonder how often it happens in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I just cared that the conversions were used redundantly like UPPER(LOWER(str)), which would be a mistake in most case, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually have queries that trigger this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use UPPER(LOWER(str)) or UPPER(UPPER(str)), or something like those.
|
This looks good to me. Can you add some unit tests for collapsing case statements? |
|
Thank you for your comments. |
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case would be more generally handled by constant folding. Can you override foldable in these expressions to be true if the child is foldable? Also please fix the toString methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the two points.
Should I remove the mentioned lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please.
On Jun 9, 2014 8:16 PM, "Takuya UESHIN" notifications@github.com wrote:
In
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:@@ -154,6 +155,10 @@ object NullPropagation extends Rule[LogicalPlan] {
case left :: Literal(null, _) :: Nil => Literal(null, e.dataType)
case _ => e
}
case e: CaseConversionExpression => e.children match {Fixed the two points.
Should I remove the mentioned lines?—
Reply to this email directly or view it on GitHub
https://github.com/apache/spark/pull/990/files#r13576056.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I do.
BTW, I noticed that there are some Expressions in the same situation like UnaryMinus, Cast, or Not, ( or BinaryArithmetic, BinaryComparison, StringRegexExpression, too ? )
I think they also can be handled by constant folding.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Can you double check that constant folding works as expected and then remove the unneeded rules?
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. |
|
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15597/ |
|
Merged build triggered. |
|
@marmbrus I checked and remove the unneeded rules from |
|
Merged build started. |
|
Merged build finished. |
|
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15606/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only foldable because both values are literals. if you make one of them an attribute (e.g., 'a.int) then it won't fold anymore and we'll need the NullPropagation rule still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, that's right!
I'll move them back.
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. |
|
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15671/ |
|
Hey, this isn't mergable anymore. Mind rebasing? |
…move the unneeded rules from NullPropagation.
|
Rebased, thanks. |
|
Merged build triggered. |
|
Merged build started. |
|
Merged build finished. All automated tests passed. |
|
All automated tests passed. |
Add optimization for `CaseConversionExpression`'s. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes #990 from ueshin/issues/SPARK-2052 and squashes the following commits: 2568666 [Takuya UESHIN] Move some rules back. dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation. c4eea67 [Takuya UESHIN] Fix toString methods. 23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable. 0ff7568 [Takuya UESHIN] Add tests for collapsing case statements. 3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's. (cherry picked from commit 9a2448d) Signed-off-by: Michael Armbrust <michael@databricks.com>
|
Thanks! Merged into master and 1.0. |
Add optimization for `CaseConversionExpression`'s. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes apache#990 from ueshin/issues/SPARK-2052 and squashes the following commits: 2568666 [Takuya UESHIN] Move some rules back. dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation. c4eea67 [Takuya UESHIN] Fix toString methods. 23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable. 0ff7568 [Takuya UESHIN] Add tests for collapsing case statements. 3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's.
Add optimization for `CaseConversionExpression`'s. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes apache#990 from ueshin/issues/SPARK-2052 and squashes the following commits: 2568666 [Takuya UESHIN] Move some rules back. dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation. c4eea67 [Takuya UESHIN] Fix toString methods. 23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable. 0ff7568 [Takuya UESHIN] Add tests for collapsing case statements. 3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's.
* LimitPushDownThroughWindow * fix * fix
Co-authored-by: Egor Krivokon <>
Co-authored-by: Egor Krivokon <>
Add optimization for
CaseConversionExpression's.