[SPARK-27460][TESTS][2.4] Running slowest test suites in their own forked JVMs for higher parallelism #25861
…JVMs for higher parallelism
This patch modifies SparkBuild so that the largest / slowest test suites (or collections of suites) can run in their own forked JVMs, allowing them to be run in parallel with each other. This opt-in / whitelisting approach allows us to increase parallelism without having to fix a long-tail of flakiness / brittleness issues in tests which aren't performance bottlenecks.
See comments in SparkBuild.scala for information on the details, including a summary of why we sometimes opt to run entire groups of tests in a single forked JVM.
The time of full new pull request test in Jenkins is reduced by around 53%:
before changes: 4hr 40min
after changes: 2hr 13min
Unit test
Closes #24373 from gengliangwang/parallelTest.
Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cc @gengliangwang, @srowen, @cloud-fan, @gatorsmile, @HyukjinKwon.
I'll add the follow-up PR, too. BTW, @gatorsmile, do we need cd4a284?
…t suite list
The test time of `HiveClientVersions` is around 3.5 minutes. This PR is to add it into the parallel test suite list. To make sure there is no colliding warehouse location, we can change the warehouse path to a temporary directory.
Unit test
Closes #24404 from gengliangwang/parallelTestFollowUp.
Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
This patch makes several test flakiness fixes.
N/A
Closes #24434 from gatorsmile/fixFlakyTest.
Lead-authored-by: gatorsmile <gatorsmile@gmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Looks fine to me.
srowen left a comment
OK pending tests.
Thank you for the review, @HyukjinKwon and @srowen!
The last Jenkins run already passed all Java/Scala tests and is now running the PySpark tests.
Test build #111030 has finished for PR 25861 at commit
Test build #111031 has finished for PR 25861 at commit
Test build #111032 has finished for PR 25861 at commit
Merged to |
…rked JVMs for higher parallelism
## What changes were proposed in this pull request?
This is a backport of #24373, #24404 and #24434.
This patch modifies SparkBuild so that the largest / slowest test suites (or collections of suites) can run in their own forked JVMs, allowing them to be run in parallel with each other. This opt-in / whitelisting approach allows us to increase parallelism without having to fix a long-tail of flakiness / brittleness issues in tests which aren't performance bottlenecks.
See comments in SparkBuild.scala for information on the details, including a summary of why we sometimes opt to run entire groups of tests in a single forked JVM.
The time of full new pull request test in Jenkins is reduced by around 53%:
before changes: 4hr 40min
after changes: 2hr 13min
## How was this patch tested?
Unit test
Closes #25861 from dongjoon-hyun/SPARK-27460.
Lead-authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Co-authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
This is a backport of #24373, #24404 and #24434.
This patch modifies SparkBuild so that the largest / slowest test suites (or collections of suites) can run in their own forked JVMs, allowing them to run in parallel with each other. This opt-in / whitelisting approach lets us increase parallelism without having to fix a long tail of flakiness / brittleness issues in tests that aren't performance bottlenecks.
See the comments in SparkBuild.scala for details, including a summary of why we sometimes opt to run entire groups of tests in a single forked JVM.
The duration of a full test run for a new pull request in Jenkins is reduced by around 53%:
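The opt-in grouping idea described above can be sketched in plain Scala, independent of sbt. This is only a minimal illustration under stated assumptions, not the code in SparkBuild.scala: the object name `TestGroupingSketch`, the `dedicatedJvmSuites` set, the example suite names, and the `default_test_group` label are all hypothetical.

```scala
// Illustrative sketch only; every name here is hypothetical,
// not taken from SparkBuild.scala.
object TestGroupingSketch {
  // Opt-in list: suites assumed slow enough to deserve a dedicated forked JVM.
  val dedicatedJvmSuites: Set[String] = Set(
    "example.SlowSuiteA",
    "example.SlowSuiteB"
  )

  // Every suite that has not opted in shares this single group.
  val DefaultGroup: String = "default_test_group"

  // A listed suite forms a group named after itself;
  // all other suites fall into the default group.
  def groupFor(suiteName: String): String =
    if (dedicatedJvmSuites.contains(suiteName)) suiteName else DefaultGroup

  // Partition suites by group name; each resulting group would then be
  // handed to its own forked JVM by the build tool.
  def group(suites: Seq[String]): Map[String, Seq[String]] =
    suites.groupBy(groupFor)
}
```

In an sbt build, each resulting group would typically be turned into a `Tests.Group` with a `Tests.SubProcess` run policy and assigned to the `testGrouping` setting, which is sbt's mechanism for forking one JVM per group. How SparkBuild.scala wires this up exactly is not shown here.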
before changes: 4hr 40min
after changes: 2hr 13min
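As a quick check of the "around 53%" figure, the two durations can be converted to minutes and compared. The object and method names below are made up for this sketch.

```scala
// Sanity check of the reported reduction; names are hypothetical.
object JenkinsSpeedup {
  // Convert an hours/minutes duration to total minutes.
  def minutes(hours: Int, mins: Int): Int = hours * 60 + mins

  val before: Int = minutes(4, 40) // 280 minutes
  val after: Int = minutes(2, 13)  // 133 minutes

  // Relative reduction in wall-clock time, as a percentage.
  val reductionPercent: Double = (before - after) * 100.0 / before
}
```

Here `reductionPercent` evaluates to 52.5, which matches the rounded "around 53%" stated above.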
How was this patch tested?
Unit test