[SPARK-22591][SQL] GenerateOrdering shouldn't change CodegenContext.INPUT_ROW #19800
cc @cloud-fan @kiszk
  }
}

test("SPARK-22591: GenerateOrdering shouldn't change ctx.INPUT_ROW") {
does it cause any bugs?
I'm working on supporting wholestage codegen when generating expression code that is safe from the 64k limit. When the context has no INPUT_ROW but we wrongly set an INPUT_ROW value, a non-existent InternalRow i will be added to the function parameters.
As we use INPUT_ROW a lot in the codegen framework, it is risky to change its value without restoring it.
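To make the risk concrete, here is a minimal sketch of how a leaked INPUT_ROW could end up as a spurious function parameter. It does not use Spark's real classes: LeakyContext, functionParams, and leakyOrdering are hypothetical stand-ins invented for illustration, and the real parameter-building logic in the codegen framework may differ.

```scala
// Hypothetical stand-in for CodegenContext; only INPUT_ROW is modeled.
final class LeakyContext { var INPUT_ROW: String = null }

object LeakSketch {
  // Hypothetical helper: builds the parameter list of a split-out function.
  // An InternalRow parameter is added only when the context names an input row.
  def functionParams(ctx: LeakyContext, vars: Seq[String]): Seq[String] = {
    val rowParam = Option(ctx.INPUT_ROW).map(r => s"InternalRow $r").toSeq
    rowParam ++ vars.map(v => s"int $v")
  }

  // Mimics the behaviour described above: sets INPUT_ROW and never restores it.
  def leakyOrdering(ctx: LeakyContext): Unit = {
    ctx.INPUT_ROW = "i"
  }
}
```

In this sketch, calling functionParams before leakyOrdering yields only the variable parameters, while calling it afterwards also yields "InternalRow i", even though no such row exists in the caller.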
def genComparisons(ctx: CodegenContext, ordering: Seq[SortOrder]): String = {
  val oldInputRow = ctx.INPUT_ROW
  val comparisons = ordering.map { order =>
    val oldCurrentVars = ctx.currentVars
seems we can also move this out of the loop.
Yes.
how about
val oldInputRow = ...
val oldCurrentVars = ...
val inputRow = "i"
ctx.INPUT_ROW = inputRow
ctx.currentVars = null
val comparisons = ...
...
ctx.INPUT_ROW = oldInputRow
ctx.currentVars = oldCurrentVars
...
Ok. Looks good.
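The save/set/restore pattern suggested above can be sketched in a self-contained form as follows. This is an illustration under stated assumptions, not the code merged in this PR: MockCodegenContext and withInputRow are hypothetical names, the comparison string is a placeholder, and the try/finally is a variation that the suggestion above does not include.

```scala
// Hypothetical stand-in for Spark's CodegenContext; only the two mutable
// fields discussed in this thread are modeled.
final class MockCodegenContext {
  var INPUT_ROW: String = null
  var currentVars: Seq[String] = null
}

object RestoreSketch {
  // Runs `body` with INPUT_ROW set to `row` and currentVars cleared,
  // then restores both fields, even if `body` throws.
  def withInputRow[T](ctx: MockCodegenContext, row: String)(body: => T): T = {
    val oldInputRow = ctx.INPUT_ROW
    val oldCurrentVars = ctx.currentVars
    ctx.INPUT_ROW = row
    ctx.currentVars = null
    try body
    finally {
      ctx.INPUT_ROW = oldInputRow
      ctx.currentVars = oldCurrentVars
    }
  }

  // Simplified analogue of genComparisons: emits one placeholder comparison
  // per column name, reading the row variable from the context.
  def genComparisons(ctx: MockCodegenContext, ordering: Seq[String]): String =
    withInputRow(ctx, "i") {
      ordering.map(col => s"compare(${ctx.INPUT_ROW}.$col);").mkString("\n")
    }
}
```

Saving and restoring once around the whole map, rather than inside each iteration, matches the point raised earlier about moving the bookkeeping out of the loop.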
Test build #84128 has finished for PR 19800 at commit
This test
Test build #84131 has finished for PR 19800 at commit
// Restore original currentVars and INPUT_ROW.
ctx.currentVars = oldCurrentVars
ctx.INPUT_ROW = oldInputRow
finalCode
ctx.currentVars = oldCurrentVars
ctx.INPUT_ROW = oldInputRow
s"""
|InternalRow $inputRow = null;
|$code
""".stripMargin
Yes, thanks.
I created https://issues.apache.org/jira/browse/SPARK-22595 to track the flaky test
Test build #84144 has finished for PR 19800 at commit
thanks, merging to master/2.2!
What changes were proposed in this pull request?
When I played with codegen while developing another PR, I found that the value of CodegenContext.INPUT_ROW is not reliable. Under wholestage codegen, it is first assigned null and then suddenly changed to i.
The reason is that GenerateOrdering changes CodegenContext.INPUT_ROW but doesn't restore it.
How was this patch tested?
Added test.