Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LoadScope scheduler: Sort scopes by number of tests to assign biggest scopes first #778

Merged

Conversation

cryvate
Copy link
Contributor

@cryvate cryvate commented Apr 21, 2022

Based upon this PR/branch, OP text quoted here:

In the real world, test scopes aren't equal: Some have just a few tests which take little time to run while others can have a large amount of tests taking longer to finish.
With the current implementation it can happen that the scheduler assigns a scope with a large amount of tests very late in the run, often we see a single worker still running tests of a large scope while all other workers are idle because all other scopes have finished running.

This PR inserts scopes into the workqueue sorted by the amount of tests. The idea is to begin testing with the biggest scopes so that small scopes can essentially be bin packed late in the run so that all workers finish processing roughly at the same time.

I am aware that the number of tests is not the best indicator for test runtime but it is better than random scope assignment.
We have developed a more sophisticated version sorting with timings from previous test runs but unfortunately it requires external tooling.

I have one reservation about this change and backwards compatibility when we have test_a.py and test_b.py:

  • in the current release, no test from test_b.py is run before any tests from test_a.py in any runner
  • after this patch, tests from test_b.py might run before any tests from test_a.py in a runner (if test_b.py contains more tests)

I don't know what the general pytest(-plugin)'s view and pytest-xdist's view in particular is on backwards compatibility in the face of, essentially, flaky/broken tests.

Long term I wonder whether we would want to expose the ordering of this queue somehow: it would be cool when e.g. running in CircleCI (which can seed based on previous running times) to be able to repeat the same within the runners, especially as they could have many cores/parallel runners and so there can be inefficiencies.

@joekohlsdorf
Copy link

joekohlsdorf commented Feb 17, 2023

Long term I wonder whether we would want to expose the ordering of this queue somehow: it would be cool when e.g. running in CircleCI (which can seed based on previous running times) to be able to repeat the same within the runners, especially as they could have many cores/parallel runners and so there can be inefficiencies.

That is exactly what I do in the "more sophisticated" approach I was talking about in the original PR. We sort scopes by previous runtime and then distribute among xdist workers according to this data. But I think it will be hard to get something like this into xdist because it depends a lot on your environment. In an ideal world xdist should provide an official extension mechanism which lets you hook in your own code easily, similar to how pytest itself is extensible.
I only submitted a very light version of this in the PR which should still be helpful for most people.

We had to do a huge monkey patch to get this done but the actual custom code is <50 lines.

FYI be careful with the CircleCI test splitter, my experience is that it doesn't correctly take timings into account even though it says so.

Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cryvate and @joekohlsdorf for the contribution. 👍

I will make a new release once this is merged.

@nicoddemus nicoddemus force-pushed the perf/sort_scopes_by_number_of_tests branch from 113a197 to 7bd858e Compare November 21, 2023 14:26
@nicoddemus nicoddemus merged commit 3fe877b into pytest-dev:master Nov 21, 2023
22 checks passed
@cryvate
Copy link
Contributor Author

cryvate commented Nov 21, 2023

Thanks @nicoddemus!

ericpre added a commit to ericpre/hyperspy that referenced this pull request Nov 25, 2023
…v/pytest-xdist#778, even if at first this appear to be unrelated (we use `loadfile` and not `loadscope`)
ericpre added a commit to ericpre/hyperspy that referenced this pull request Nov 25, 2023
…v/pytest-xdist#778, even if at first this appear to be unrelated (we use `loadfile` and not `loadscope`)
ericpre added a commit to ericpre/hyperspy that referenced this pull request Nov 29, 2023
…v/pytest-xdist#778, even if at first this appear to be unrelated (we use `loadfile` and not `loadscope`)
ericpre added a commit to ericpre/hyperspy that referenced this pull request Nov 29, 2023
…v/pytest-xdist#778, even if at first this appear to be unrelated (we use `loadfile` and not `loadscope`)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants