-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-8935][SQL] Implement code generation for all casts #7365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Not ready to be reviewed, just want to trigger test first. |
|
Jenkins, OK to test |
|
@YijieSHEN Just want to mention that you can easily test it locally by: |
|
Test build #1061 has finished for PR 7365 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not remove this and put it into genCode directly?
|
@davies , thanks for the tip, it's just what I need to check any unconscious situations. :) @davies @rxin, the pr is almost done for the current implementation, I introduce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It imitate the behaviour of defineCodeGen and nullSafeCodeGen, hold the calculation for either genCode if it's a primitive type cast or to be later used in ComplexType's member evaluation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems a bit too much here to be honest
|
|
|
@cloud-fan Take a look at this also if you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, change this by accident, will revert in following commit.
|
@cloud-fan , thanks for reviewing, will resolve the comment soon. |
2edc21c to
5de0a95
Compare
|
Jenkins, ok to test. |
|
I'm pretty sure Jenkins hates you. |
|
Jenkins, ok to test. |
|
Jenkins, test this please. |
|
Test build #37752 has finished for PR 7365 at commit
|
|
Test build #37773 has finished for PR 7365 at commit
|
|
Test build #37778 has finished for PR 7365 at commit
|
|
Test build #37784 has finished for PR 7365 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimizer will remove unnecessary casts, so I think we don't need this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, let me check why optimizer didn't work for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get this, checkEvaluation in testSuite doesn't trigger optimiser at all, does it? So I think I should remove unnecessary cast introduced in DateExpressionSuite, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the fact that we also do cast in interpreted version when from == to, let's do this in codegen version too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess its reasonable to remove unnecessary cast introduced in DateExpressionSuite and remove the interpreted version as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to keep them. I don't really have a strong preference on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cast expression will be inserted during physical operator transformation,for example Average, in which case its possible we add a cast to self expression.
|
Test build #37800 has finished for PR 7365 at commit
|
|
More comments on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we can use result as GeneratedExpressionCode for resultPrim, resultNull, child for childPrim and childNull. resultType is this.dataType.
Right now, it has two many parameters, hard to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is short, I'd like to inline it into genCode(), it will be easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
castCode is intendedly take all these parameters because we have to do cast for ComplexType as well, so we need call castCode recursively in these casts.
I think I should add a comment in code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used by castStructCode, never mind the above comments.
|
Test build #38013 has finished for PR 7365 at commit
|
|
It's a |
|
@davies , could you review this again? thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use should use the special getter to access them, because UnsafeRow does not support generic getter for primitive types, see ctx.getColumn, then we don't need to unboxPrimitive()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I get this, I will use ctx.getColumn and ctx.setColumn in new commit.
|
Test build #1153 has finished for PR 7365 at commit
|
|
Test build #38067 has finished for PR 7365 at commit
|
|
build error caused by : |
|
Test build #1183 has finished for PR 7365 at commit
|
|
LGTM |
|
Merging into master, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason that we create fresh name for these temp variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose we are casting a structType that has two arrayType field, all the structType's filed will be expand one by one, therefore, we need fresh name for each of the arrayField cast to avoid redeclaring variables in the same scope.
JIRA: https://issues.apache.org/jira/browse/SPARK-8935