-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29518][SQL][TEST] Benchmark date_part for INTERVAL
#26175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #112316 has finished for PR 26175 at commit
|
| case "extract" => s"EXTRACT($field FROM ${castExpr(from)})" | ||
| case "date_part" => s"DATE_PART('$field', ${castExpr(from)})" | ||
| case "extract" => s"EXTRACT($field FROM ${castExpr(from)}) AS $field" | ||
| case "date_part" => s"DATE_PART('$field', ${castExpr(from)}) AS $field" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why do we need to add alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I debugged this, I showed dataframes to terminal. And printed tables were so wide.
| "QUARTER", "MONTH", "DAY", | ||
| "HOUR", "MINUTE", "SECOND", | ||
| "MILLISECONDS", "MICROSECONDS", "EPOCH") | ||
| val settings = Map( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a big deal but Settings seems only used within runBenchmarkSuite. I think it's fine to just make this pretty with indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? Could you clarify, please. Do you want to replace Settings by let's say tuples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one?
for {
(dataType, (fields, funcs, iterNum)) <- Map(
"timestamp" -> (datetimeFields, Seq("extract", "date_part"), N),
"date" -> (datetimeFields, Seq("extract", "date_part"), N),
"interval" -> (intervalFields, Seq("date_part"), N))
func <- funcs} {There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, I was thinking like that. I don't mind if you prefer the current way. no biggie.
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine otherwise.
|
Merged to master. |
What changes were proposed in this pull request?
I extended
ExtractBenchmarkto support theINTERVALtype of thesourceparameter of thedate_partfunction.Why are the changes needed?
date_partfunction in the future.date_partfor theINTERVALtypeDoes this PR introduce any user-facing change?
No
How was this patch tested?
By running the benchmark and print out produced values per each
fieldvalue.