RFC Allow a loop cycle after every message in the batch #5443
This is good as it brings up a lot of cruft. For example, one of the tests that failed,

    while not any(f.key in s.tasks for f in futures):
        await asyncio.sleep(0.001)
    assert not any(s.tasks[f.key] in steal.key_stealable for f in futures)

currently waits for at least one task to appear in the scheduler and then expects all tasks to be there.
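To illustrate why waiting on `any` is fragile once tasks can arrive across several loop iterations, here is a minimal sketch. It does not use the real scheduler: `populate`, `wait_until` and the plain `tasks` dict are hypothetical stand-ins, and the waiter polls with `asyncio.sleep(0)` rather than `0.001` only to keep the interleaving deterministic.

```python
import asyncio


async def populate(tasks, keys):
    # Hypothetical stand-in for a scheduler that registers one task
    # per event loop iteration.
    for key in keys:
        tasks[key] = object()
        await asyncio.sleep(0)


async def wait_until(tasks, keys, predicate):
    # Poll until predicate (any or all) is satisfied, then report how
    # many keys are actually present at that moment.
    while not predicate(k in tasks for k in keys):
        await asyncio.sleep(0)
    return sum(k in tasks for k in keys)


async def main(predicate):
    tasks, keys = {}, ["x", "y", "z"]
    _, present = await asyncio.gather(
        populate(tasks, keys),
        wait_until(tasks, keys, predicate),
    )
    return present


present_after_any = asyncio.run(main(any))
present_after_all = asyncio.run(main(all))
print(present_after_any, present_after_all)
```

With `any`, the waiter proceeds while only some of the keys are registered, so a follow-up lookup such as `s.tasks[f.key]` could hit a missing key. Waiting on `all` avoids that in this toy setup.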
Unit Test Results: 10 files (-2), 10 suites (-2), 5h 54m 10s (-1h 21m 6s). Results for commit 8fd610c, compared against base commit b0dd9db.
The following two tests seem to consistently fail on this branch
Had a look at the failure of distributed/distributed/scheduler.py Lines 7926 to 7949 in 8a99ac7
This may impact all tests related to
so… basically this may impact everything.
I stumbled upon this and again realized how incredibly complex our system is. For those not familiar, asyncio treats asyncio.sleep(0) as a special case: it yields control to the event loop for exactly one iteration and then resumes.
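A minimal sketch of that `asyncio.sleep(0)` behaviour, using only the standard library. The `worker` coroutine and the `log` list are made up for illustration.

```python
import asyncio


async def worker(log, name):
    log.append(f"{name} start")
    # Yield to the event loop once; other ready tasks get a turn
    # before this coroutine resumes.
    await asyncio.sleep(0)
    log.append(f"{name} end")


async def main():
    log = []
    await asyncio.gather(worker(log, "a"), worker(log, "b"))
    return log


order = asyncio.run(main())
print(order)
```

Both workers start before either finishes, because each one hands control back to the loop at the `sleep(0)`.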
The indentation of this sleep is interesting: the state on master allows an event loop iteration only after an entire batch of messages has been worked off. If I change the indentation, and thereby allow a loop iteration after every message, a lot of our tests break. I don't think any of our tests should be so sensitive that they react to such a subtle change.
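The difference between the two indentations can be sketched as follows. This is not the actual handler code: `handle_batch`, `observer` and the `yield_per_message` flag are hypothetical names, and the observer stands in for any concurrently running coroutine (for example a test) that inspects shared state.

```python
import asyncio


async def handle_batch(messages, processed, yield_per_message):
    for msg in messages:
        processed.append(msg)  # stand-in for handling one message
        if yield_per_message:
            # Proposed variant: one loop iteration after every message.
            await asyncio.sleep(0)
    if not yield_per_message:
        # Variant on master: one loop iteration after the whole batch.
        await asyncio.sleep(0)


async def observer(processed, seen, n_total):
    # Records how many messages have been handled each time it runs.
    while len(processed) < n_total:
        seen.append(len(processed))
        await asyncio.sleep(0)
    seen.append(len(processed))


async def run(yield_per_message):
    processed, seen = [], []
    messages = ["m1", "m2", "m3"]
    await asyncio.gather(
        handle_batch(messages, processed, yield_per_message),
        observer(processed, seen, len(messages)),
    )
    return seen


per_batch = asyncio.run(run(False))
per_message = asyncio.run(run(True))
print(per_batch, per_message)
```

When the handler yields only once per batch, the observer sees nothing but the fully processed batch. When it yields after every message, the observer also sees the intermediate states, which is the kind of partial state a test may implicitly assume never exists.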
I'm wondering if looking into this would help us with overall CI stability.
That's also interesting since we're treating this quite differently for coroutine functions and ordinary functions (see a few lines above).
FWIW I tried aligning the behaviour of these two cases a while ago in #4734 but failed to stabilize the branch. Back then I ran into many of our flaky tests while cleaning this up. In that branch, I removed the sleep entirely.
cc @crusaderky @jcrist