Skip to content

Conversation

@ueshin
Copy link
Member

@ueshin ueshin commented Jun 6, 2014

Add optimization for CaseConversionExpression's.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15495/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious - do you have a use case for this rule? I wonder how often it happens in practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
I just cared that the conversions were used redundantly like UPPER(LOWER(str)), which would be a mistake in most case, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually have queries that trigger this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use UPPER(LOWER(str)) or UPPER(UPPER(str)), or something like those.

@rxin
Copy link
Contributor

rxin commented Jun 9, 2014

This looks good to me. Can you add some unit tests for collapsing case statements?

@ueshin
Copy link
Member Author

ueshin commented Jun 9, 2014

Thank you for your comments.
I added some tests.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15561/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case would be more generally handled by constant folding. Can you override foldable in these expressions to be true if the child is foldable? Also please fix the toString methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the two points.
Should I remove the mentioned lines?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please.
On Jun 9, 2014 8:16 PM, "Takuya UESHIN" notifications@github.com wrote:

In
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:

@@ -154,6 +155,10 @@ object NullPropagation extends Rule[LogicalPlan] {
case left :: Literal(null, _) :: Nil => Literal(null, e.dataType)
case _ => e
}

  •  case e: CaseConversionExpression => e.children match {
    

Fixed the two points.
Should I remove the mentioned lines?


Reply to this email directly or view it on GitHub
https://github.com/apache/spark/pull/990/files#r13576056.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I do.
BTW, I noticed that there are some Expressions in the same situation like UnaryMinus, Cast, or Not, ( or BinaryArithmetic, BinaryComparison, StringRegexExpression, too ? )
I think they also can be handled by constant folding.
What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Can you double check that constant folding works as expected and then remove the unneeded rules?

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15597/

@AmplabJenkins
Copy link

Merged build triggered.

@ueshin
Copy link
Member Author

ueshin commented Jun 10, 2014

@marmbrus I checked and remove the unneeded rules from NullPropagation.
Could you please check the changes?

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15606/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only foldable because both values are literals. if you make one of them an attribute (e.g., 'a.int) then it won't fold anymore and we'll need the NullPropagation rule still.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, that's right!
I'll move them back.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15671/

@marmbrus
Copy link
Contributor

Hey, this isn't mergable anymore. Mind rebasing?

@ueshin
Copy link
Member Author

ueshin commented Jun 11, 2014

Rebased, thanks.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15693/

asfgit pushed a commit that referenced this pull request Jun 12, 2014
Add optimization for `CaseConversionExpression`'s.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #990 from ueshin/issues/SPARK-2052 and squashes the following commits:

2568666 [Takuya UESHIN] Move some rules back.
dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation.
c4eea67 [Takuya UESHIN] Fix toString methods.
23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable.
0ff7568 [Takuya UESHIN] Add tests for collapsing case statements.
3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's.

(cherry picked from commit 9a2448d)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 9a2448d Jun 12, 2014
@marmbrus
Copy link
Contributor

Thanks!

Merged into master and 1.0.

pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
Add optimization for `CaseConversionExpression`'s.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes apache#990 from ueshin/issues/SPARK-2052 and squashes the following commits:

2568666 [Takuya UESHIN] Move some rules back.
dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation.
c4eea67 [Takuya UESHIN] Fix toString methods.
23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable.
0ff7568 [Takuya UESHIN] Add tests for collapsing case statements.
3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's.
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Add optimization for `CaseConversionExpression`'s.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes apache#990 from ueshin/issues/SPARK-2052 and squashes the following commits:

2568666 [Takuya UESHIN] Move some rules back.
dde7ede [Takuya UESHIN] Add tests to check if ConstantFolding can handle null literals and remove the unneeded rules from NullPropagation.
c4eea67 [Takuya UESHIN] Fix toString methods.
23e2363 [Takuya UESHIN] Make CaseConversionExpressions foldable if the child is foldable.
0ff7568 [Takuya UESHIN] Add tests for collapsing case statements.
3977d80 [Takuya UESHIN] Add optimization for CaseConversionExpression's.
wangyum added a commit that referenced this pull request May 26, 2023
* LimitPushDownThroughWindow

* fix

* fix
udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024
mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants