[SPARK-34211][SQL][TESTS] Benchmark TPC-DS with 1GB scale factor #31303
Test build #134395 has finished for PR 31303 at commit
Kubernetes integration test starting
Kubernetes integration test status success
Test build #134397 has finished for PR 31303 at commit
      run: cd spark-sql-perf && build/sbt "test:runMain com.databricks.spark.sql.GenTPCDSData `pwd`/../tpcds-kit/tools 1 `pwd`/../tpcds1g parquet"
    - name: Run TPCDSQueryBenchmark
      run: |
        SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.TPCDSQueryBenchmark --data-location `pwd`/tpcds1g --cbo"
Do we enable CBO by default? I think we should run it with the default conf.
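For reference, `spark.sql.cbo.enabled` is `false` by default. A rough sketch of the same step under the default configuration, assuming that dropping the `--cbo` flag from the command in the diff is all that is needed to leave CBO at its default setting:

```yaml
# Sketch only: same step as in the diff above, minus the --cbo flag,
# on the assumption that the benchmark then runs with the default conf.
- name: Run TPCDSQueryBenchmark
  run: |
    SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.TPCDSQueryBenchmark --data-location `pwd`/tpcds1g"
```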
cc @Ngone51, @dongjoon-hyun, @cloud-fan, @peter-toth FYI
    - name: Checkout spark-sql-perf repository
      uses: actions/checkout@v2
      with:
        repository: wangyum/spark-sql-perf
Can you describe what these repos do in the PR description? Also, what does this fork do? It looks like the only diff is wangyum/spark-sql-perf@abf08eb. Can you just use the original repo, and keep the data generation Scala file somewhere in Spark?
Yes. The original repo needs a class: databricks/spark-sql-perf#196
      run: cd spark-sql-perf && build/sbt "test:runMain com.databricks.spark.sql.GenTPCDSData `pwd`/../tpcds-kit/tools 1 `pwd`/../tpcds1g parquet"
    - name: Run TPCDSQueryBenchmark
      run: |
        SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.TPCDSQueryBenchmark --data-location `pwd`/tpcds1g --cbo"
Shall we keep the cache of Coursier in SBT?
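A minimal sketch of what caching Coursier could look like in the workflow, assuming `actions/cache@v2`, the default Coursier cache directory on Linux (`~/.cache/coursier`), and a cache key derived from the build files. The step name and key pattern are illustrative and not taken from this PR:

```yaml
# Illustrative step: restores and saves the Coursier cache between runs.
# The key changes whenever the hashed build files change; restore-keys
# falls back to the most recent cache with the same prefix.
- name: Cache Coursier local repository
  uses: actions/cache@v2
  with:
    path: ~/.cache/coursier
    key: ${{ runner.os }}-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
    restore-keys: |
      ${{ runner.os }}-coursier-
```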
      run: |
        ./build/sbt -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Phadoop-2.7 compile test:compile
  tpcds1g:
no big deal but I would name it tpcds-1g
There are a few issues, @wangyum and @HyukjinKwon.
- I'm not sure if you are aware of it, but GitHub Action job performance is inconsistent. I'm -1 for depending on benchmarks from unknown runners.
- The Apache Infra team already asked us to manage the shared resource pool efficiently. Given that, we had better avoid performance benchmarking in Apache Spark GitHub Action jobs.
Like @maropu did already, why don't we have this in the downstream?
Is the perf inconsistency big? It will give you rough numbers to check. I was also thinking of this as a test. I remember we faced a case where TPC-DS was broken whereas the Spark tests passed. The resource is indeed a problem. It has been a problem so far. I think we have added some jobs if they are worthwhile though. I'm okay to wait until we have some more resources and not add new jobs for the time being.
Actually, I understand the requirement and the intention because I also reported SPARK-33822 (TPCDS Q5 fails if spark.sql.adaptive.enabled=true).
@wangyum Could you provide a result in your repository which is requested by @HyukjinKwon ? It depends on the characteristic of the job.
BTW, on top of the above issues, this is not safe in terms of security. Recently, the Apache Infra team introduced strict security enforcements by banning the runnable source of GitHub Action. We have no available option when the 3rd-party repositories are compromised. Even the docker image repo, we should migrate
Yeah, it would be great if we can tell whether the perf diff is considerably big, @wangyum. If it's too big, it makes less sense to add this. And yes, it would be great if there are ways around to use them. If I remember correctly, the purpose of banning GitHub Actions and 3rd-party repositories was to avoid running unreviewed code #31104 (comment).
Latest benchmark result:
I agree that we can still track some big perf differences (like 10x slower) with GitHub Actions, but running benchmarks in GitHub Actions takes a lot of resources and may not pay off if it can only catch big perf differences, which are rare. We can either fork Spark and run benchmarks with GitHub Actions under your own account, or set up AWS machines to run benchmarks like @maropu did.
Close it. Thank you all.
Just to be correct, running 1GB is not a lot. It takes less than an hour, which is the same as other parts of our tests. As I said earlier, it's not only a performance benchmark but also a test, see SPARK-33822. But it's fine to drop it given the resource issue going on globally in ASF repos.
Once we resolve the resource limitation issue, it would be great if we can bring this back.
Thank you for the decision all!
What changes were proposed in this pull request?
This PR adds a new GitHub Action to benchmark TPC-DS with a 1GB scale factor.
Why are the changes needed?
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Spark-3-1-1-RC1-tt30579.html#a30603
https://issues.apache.org/jira/browse/SPARK-26346
Does this PR introduce any user-facing change?
No.
How was this patch tested?
N/A.