[SPARK-25488][SQL][TEST] Refactor MiscBenchmark to use main method #22500
(i.toString, (i + 1).toString, (i + 2).toString, (i + 3).toString)
})))).toDF("col", "arr")

df.selectExpr("*", "explode(arr) as arr_col")
The function name should be `explode`.
generate stack:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
generate stack wholestage off       17179 / 17719          1.0        1024.0       1.0X
generate stack wholestage on        13674 / 14112          1.2         815.0       1.3X
I will do a deep dive into this; the relative speedup with wholestage on was 15.5X before.
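For readers comparing the 1.3X figure with the earlier 15.5X, here is a minimal sketch of how the Relative column can be reproduced from the table's Best times. It assumes Relative is the baseline row's best time divided by the current row's best time; the object and method names are hypothetical and not part of Spark.

```scala
object RelativeSpeedup {
  // Assumption: Relative = best time of the baseline (first) row
  // divided by the best time of the row being reported.
  def relative(baselineBestMs: Double, bestMs: Double): String =
    "%.1fX".formatLocal(java.util.Locale.ROOT, baselineBestMs / bestMs)

  def main(args: Array[String]): Unit = {
    // Best times (ms) taken from the "generate stack" table in this thread.
    println(relative(17179, 17179)) // wholestage off: prints 1.0X
    println(relative(17179, 13674)) // wholestage on:  prints 1.3X
  }
}
```

Under that assumption, a 15.5X result would have required a wholestage-on best time of roughly 17179 / 15.5 ms for the same baseline, which is the gap being investigated here.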
The performance decline is caused by this commit: "Disable generate codegen since it fails my workload".
@rxin Do we have a plan to enable generate codegen?
Although this is not related to this refactoring, ping @rxin and @kiszk because @kiszk seemed to want to fix the root cause of the failure.
@rxin, which operation in Generator causes your workload to fail? (See the commit comment.)
Test build #96368 has finished for PR 22500 at commit
# Conflicts: # sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala
Test build #96820 has finished for PR 22500 at commit
Retest this please.
Test build #96863 has finished for PR 22500 at commit
 */
class MiscBenchmark extends BenchmarkWithCodegen {

  ignore("filter & aggregate without group") {
This refactoring introduces a long function body in `runBenchmarkSuite`. In general, that is not a good direction.
Could you map each `ignore` block to an independent function and make `runBenchmarkSuite()` invoke those functions in sequence?
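A minimal sketch of the structure requested above, in plain Scala with no Spark dependency. Only `runBenchmarkSuite` and the case name "filter & aggregate without group" come from this thread; the object name, the `timeNs` helper, the `executed` log, and the second case are hypothetical stand-ins, and the loop bodies are placeholders rather than the real benchmark code.

```scala
import scala.collection.mutable.ArrayBuffer

object MiscBenchmarkSketch {
  // Records which cases ran, in order (hypothetical, for illustration only).
  val executed: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Hypothetical helper: runs `body` once and returns elapsed nanoseconds.
  def timeNs(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    System.nanoTime() - start
  }

  // One independent function per former ignore("...") block.
  def filterAndAggregateWithoutGroup(numRows: Long): Unit = {
    executed += "filter & aggregate without group"
    val ns = timeNs {
      // Placeholder workload: sum the odd values below numRows.
      var sum = 0L
      var i = 0L
      while (i < numRows) {
        if ((i & 1L) == 1L) sum += i
        i += 1
      }
    }
    println(s"filter & aggregate without group: $ns ns")
  }

  def generateExplode(numRows: Int): Unit = {
    executed += "generate explode"
    val ns = timeNs {
      // Placeholder for explode: flatten a four-element array per row.
      val flattened = (0 until numRows).flatMap { i =>
        Seq(i, i + 1, i + 2, i + 3).map(_.toString)
      }
      require(flattened.size == numRows * 4)
    }
    println(s"generate explode: $ns ns")
  }

  // The suite entry point stays short: it only sequences the cases.
  def runBenchmarkSuite(): Unit = {
    filterAndAggregateWithoutGroup(1000L)
    generateExplode(1000)
  }

  def main(args: Array[String]): Unit = runBenchmarkSuite()
}
```

Keeping each case in its own function means a single benchmark can be rerun or commented out without touching the others, and `runBenchmarkSuite()` reads as a table of contents.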
Test build #96877 has finished for PR 22500 at commit
Test build #96882 has finished for PR 22500 at commit
retest this please
Test build #96888 has finished for PR 22500 at commit
retest this please
Test build #96889 has finished for PR 22500 at commit
Retest this please.
Test build #96903 has finished for PR 22500 at commit
Retest this please.
Test build #96910 has finished for PR 22500 at commit
@dongjoon-hyun Is this ready to go?
Hi, @wangyum.
Test build #97035 has finished for PR 22500 at commit
Test build #97032 has finished for PR 22500 at commit
retest this please
Test build #97036 has finished for PR 22500 at commit
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @wangyum and @dilipbiswal.
Merged to master.
## What changes were proposed in this pull request?

Refactor `MiscBenchmark` to use main method.
Generate benchmark result:
```sh
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.MiscBenchmark"
```

## How was this patch tested?

manual tests

Closes apache#22500 from wangyum/SPARK-25488.

Lead-authored-by: Yuming Wang <yumwang@ebay.com>
Co-authored-by: Yuming Wang <wgyumg@gmail.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Refactor `MiscBenchmark` to use main method.
Generate benchmark result:
    SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.MiscBenchmark"
How was this patch tested?
manual tests
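To illustrate what "use main method" can look like together with the `SPARK_GENERATE_BENCHMARK_FILES` switch from the command above, here is a rough sketch in plain Scala. It is an assumption about the general shape, not Spark's actual base class: `MainMethodBenchmark`, `shouldGenerateFiles`, `resultFileName`, and `TinyBenchmark` are hypothetical names, and the result file location is chosen only for illustration.

```scala
import java.io.{File, FileOutputStream, PrintStream}

// Hypothetical base class; Spark's real benchmark base may differ.
abstract class MainMethodBenchmark {
  // Hypothetical: name of the result file written when generation is on.
  def resultFileName: String

  // Each benchmark writes its results to the stream it is given.
  def runBenchmarkSuite(out: PrintStream): Unit

  // Pure helper so the environment-variable check can be tested in isolation.
  def shouldGenerateFiles(env: Map[String, String]): Boolean =
    env.get("SPARK_GENERATE_BENCHMARK_FILES").contains("1")

  def main(args: Array[String]): Unit = {
    if (shouldGenerateFiles(sys.env)) {
      // Write results to a file instead of the console.
      val out = new PrintStream(new FileOutputStream(new File(resultFileName)))
      try runBenchmarkSuite(out) finally out.close()
    } else {
      runBenchmarkSuite(System.out)
    }
  }
}

object TinyBenchmark extends MainMethodBenchmark {
  val resultFileName: String = "TinyBenchmark-results.txt"

  def runBenchmarkSuite(out: PrintStream): Unit =
    out.println("tiny benchmark ran")
}
```

Because the entry point is an ordinary `main`, the benchmark can be launched with a `runMain`-style command like the one in the description, rather than by enabling an ignored test case.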