Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-82054: allow test runner to split test_asyncio #103859

Closed
wants to merge 0 commits into from

Conversation

zitterbewegung
Copy link
Contributor

@zitterbewegung zitterbewegung commented Apr 25, 2023

Summary:

This runs test_asyncio sub-tests in parallel using sharding by cinder. These two tests are typically the long-poles in runs because they are modules with a lot of further sub-tests run serially. By breaking out the sub-tests as independent modules we can run a lot more in parallel.

After porting we can see the direct impact is extremely large (15% increase in performance):

  • Without this change:
    • Running make test is 5 min 26 sec
  • With this change:
    • Running make test takes 3 min and 45 seconds

The drawbacks are that this implementation is hacky and due to the sorting of the tests it obscures when the asyncio tests occur and involves changing CPython test infrastructure but, the time saved it is worth it . It's not a complicated change and I think the win in productivity with the change above is significant.

@zitterbewegung zitterbewegung marked this pull request as draft April 25, 2023 23:18
@zitterbewegung zitterbewegung marked this pull request as ready for review April 25, 2023 23:24
@zitterbewegung zitterbewegung changed the title gh-82054: Porting execution of tests in parallel by sharding (note this only is for asyncio and compiler) gh-82054: Porting parallel test execution from cinder by sharding (note this only is for asyncio and compiler) Apr 25, 2023
Copy link
Member

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a couple comments.

Lib/test/libregrtest/runtest.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@zitterbewegung zitterbewegung changed the title gh-82054: Porting parallel test execution from cinder by sharding (note this only is for asyncio and compiler) gh-82054: Porting parallel test execution from cinder by sharding (note this only is for asyncio) Apr 26, 2023
@carljm carljm changed the title gh-82054: Porting parallel test execution from cinder by sharding (note this only is for asyncio) gh-82054: allow test runner to split test_asyncio for better parallel execution Apr 26, 2023
@carljm
Copy link
Member

carljm commented Apr 26, 2023

@zitterbewegung in order to make the case for landing this, it would be useful to update the PR description with some current data. Copying the description from the Cinder diff isn't really useful, since it's from a much older version. What would be helpful is to run make test with and without this change, and report any improvement in overall runtime.

It also may be that test_asyncio is no longer the longest pole, and we may need to apply the same treatment to some other modules (test_multiprocessing?) in order to see a significant benefit in overall runtime of make test.

@zitterbewegung
Copy link
Contributor Author

zitterbewegung commented Apr 26, 2023

@carljm I can see an overall benefit of 100 seconds from running cpython's baseline. I have attached both runs.
To give explicit timings before this change running make test is 5 min 26 sec and with the change it goes to 3 min and 46 seconds (100 seconds saved). Adding multiprocessing gives us a 5 second improvement.

This was executed on a AMD Ryzen Threadripper 3970X 32-Core Processor with 128GB of ram.

Patched_run_of_tests.txt
original_time_cpython.txt

@carljm
Copy link
Member

carljm commented Apr 26, 2023

@zitterbewegung Awesome, those are great results! Can you directly update the PR description/summary to eliminate the text copied from the Cinder diff and instead just summarize these results?

@zitterbewegung
Copy link
Contributor Author

@zitterbewegung Awesome, those are great results! Can you directly update the PR description/summary to eliminate the text copied from the Cinder diff and instead just summarize these results?

Updated summary with new results.

@carljm
Copy link
Member

carljm commented Apr 26, 2023

@zitterbewegung Ok, one last thing I see is that make patchcheck is failing on the Azure Pipelines CI job, due to something it doesn't like about the whitespace in runtests.py. Pretty nitpicky, but we do want to keep the CI green; can you run make patchcheck locally and try to adjust the diff such that patchcheck is happy?

@zitterbewegung
Copy link
Contributor Author

@zitterbewegung Awesome, those are great results! Can you directly update the PR description/summary to eliminate the text copied from the Cinder diff and instead just summarize these results?

Done.

@zitterbewegung
Copy link
Contributor Author

zitterbewegung commented Apr 27, 2023

@carljm I accidentally closed this PR while I was adding test_multiprocessing which saved 6 more seconds the new PR is #103927

@zitterbewegung zitterbewegung changed the title gh-82054: allow test runner to split test_asyncio for better parallel execution gh-82054: allow test runner to split test_asyncio and test_multiprocessing for better parallel execution Apr 27, 2023
@zitterbewegung zitterbewegung changed the title gh-82054: allow test runner to split test_asyncio and test_multiprocessing for better parallel execution gh-82054: allow test runner to split test_asyncio Apr 27, 2023
@zitterbewegung
Copy link
Contributor Author

zitterbewegung commented Apr 27, 2023

@zitterbewegung Ok, one last thing I see is that make patchcheck is failing on the Azure Pipelines CI job, due to something it doesn't like about the whitespace in runtests.py. Pretty nitpicky, but we do want to keep the CI green; can you run make patchcheck locally and try to adjust the diff such that patchcheck is happy?

I ran make patch check after I did the changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants