Replace test_(do_not_)steal_communication_heavy_tasks
tests with more robust versions
#7243
…o_not_steal_communication_heavy_tasks
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
15 files ±0, 15 suites ±0, 6h 19m 22s ⏱️ -17m 42s
For more details on these failures, see this check.
Results for commit 2510468. ± Comparison against base commit aa1c6d8.
♻️ This comment has been updated with latest results.
If they are contradicting tests, why is it flaky? Shouldn't it reliably fail? This still sounds like there is a bug somewhere.
The entire test makes fairly little sense at the moment. I think it should have been dropped in #7075 and yet I missed it. We could spend time cleaning up the test, but the way it is written right now, I think it is prone to a bunch of race conditions and does not test anything useful. Also, any useful behavior should be tested in other places (e.g. |
distributed/tests/test_steal.py (outdated)
@@ -825,44 +825,6 @@ def block_reduce(x, y, event):
    assert not b.data
test_do_not_steal_communication_heavy_tasks doesn't look right.

Firstly, there's a race condition. The test waits for x and y to enter running state on the worker, polling every 100ms. The first iteration of while not a.state.tasks will always fail (because we never yielded the event loop since submit). The second iteration, 100ms later, most times will find that a and b are in memory and two block_reduce are executing. However, there's nothing guaranteeing it, so when you call steal.balance() on the next line you might have x or y still running, or in memory but without the scheduler knowing yet. Unlikely, but possible.

The test implicitly relies on tasks of unknown duration to be stolen (#5572). It should be changed not to rely on this specific use case.

Finally, why bother calling steal.stop()? If the block_reduce tasks can't be stolen, it should be inconsequential.
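The point about the first poll always failing can be illustrated without dask at all. The following is a minimal sketch using only asyncio; the names state, started, and producer are hypothetical stand-ins and are not taken from the test under review. It shows that work scheduled on the event loop cannot have started before the caller yields, and that waiting on an explicit event removes the dependency on a polling interval.

```python
import asyncio


async def producer(state, started):
    # Stand-in for a worker picking up a submitted task (hypothetical).
    state["tasks"].append("x")
    started.set()


async def main():
    state = {"tasks": []}
    started = asyncio.Event()

    # create_task only schedules the coroutine; it does not run it yet.
    task = asyncio.create_task(producer(state, started))

    # We have not yielded to the event loop, so nothing can have happened.
    seen_before_yield = bool(state["tasks"])

    # Waiting on an explicit event is deterministic: no sleep interval,
    # no guessing how far the producer has progressed.
    await started.wait()
    seen_after_event = bool(state["tasks"])

    await task
    return seen_before_yield, seen_after_event


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A polling loop with a fixed sleep would observe the same initial empty state, but what it sees on the next iteration depends on timing rather than on an explicit signal.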
Overhauled test in #7250. This PR can be merged as is.
It doesn't fail because it doesn't actually test where the tasks end up being executed.
I've run it a couple thousand times locally without being able to reproduce the flake. Since the implementation of
You have a good point there, I have dropped that one as well and added three other tests that should cover the functionality that we initially wanted to test here. I like your use of
@@ -1446,6 +1446,57 @@ def func(*args):
    assert (ntasks_per_worker < ideal * 1.5).all(), (ideal, ntasks_per_worker)


def test_balance_steal_communication_heavy_tasks():
    dependencies = {"a": 10, "b": 10}
If I change this line to {"a": 1e-6, "b": 1e-6} the test remains green, and if I go lower than that I instead get "ValueError: Expected a value larger than 16 integer but got 10."
Are we sure we're actually testing anything here?
This test breaks if we double the cost of the dependencies to {"a": 20, "b": 20}, as the tasks have become too expensive to move in this setup. By reducing the cost of the dependencies, we are checking whether we would move tasks that are cheap to move, which is the desired behavior. Reducing the cost below 1e-6 creates trouble since we are calculating the actual size of the dependencies as the product of their specified cost and the available bandwidth.
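The cost-to-size relationship described here can be sketched as follows. This is an illustration only: dependency_nbytes is a hypothetical helper, not the function used in the test suite, and the bandwidth of 100e6 bytes per second is an assumption that does not come from this PR. The lower bound of 16 bytes is the one quoted in the error message above.

```python
ASSUMED_BANDWIDTH = 100e6  # bytes per second; an assumption, not from the PR
MIN_NBYTES = 16  # lower bound quoted in the ValueError message


def dependency_nbytes(cost, bandwidth=ASSUMED_BANDWIDTH):
    """Hypothetical helper: bytes needed so a transfer takes `cost` seconds."""
    # round() avoids truncating products such as 1e-6 * 1e8 that are
    # not exactly representable in floating point.
    nbytes = round(cost * bandwidth)
    if nbytes < MIN_NBYTES:
        raise ValueError(
            f"Expected a value larger than {MIN_NBYTES} integer but got {nbytes}."
        )
    return nbytes


if __name__ == "__main__":
    for cost in (20, 10, 1e-6):
        print(cost, dependency_nbytes(cost))
```

Under the assumed bandwidth, a cost of 1e-7 maps to 10 bytes and is rejected, which would be consistent with the reported error; the actual bandwidth used by the test may differ.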
I think I was too hasty in approving.
Yikes, looks like merging |
This PR drops flaky test_steal_communication_heavy_tasks since it contradicts test_do_not_steal_communication_heavy_tasks and should be outdated after the latest changes to work-stealing.
pre-commit run --all-files