[WIP] Update Worker._executing handling #5500
Should we try to get this into the release if it does pass CI?
I'm curious if we could consolidate management of One issue is that when going to executing, there’s often a But as far as removing it, why would you ever want to leave the task in (Note that trying variations of these changes locally, many tests seem to deadlock, so I'm sure there's a reason we don't do this; I'd just like to understand why.)
I'd feel more comfortable getting @fjetter's eyes on this before releasing it. I think this is okay, but my knowledge of the worker state machine is not as up to date as it was pre-refactor.
An alternative to dask#5500 for solving dask#5497. See dask#5500 (comment) for details.
Ok. Happy to leave this out if you prefer. I just figured that if it helps solve a regression and is relatively simple, it might be worth putting in. If we don't think it is that simple (or there are subtle points that need further exploration), I understand leaving it out.
The change looks OK. I typically try to avoid generic catch-all situations, e.g. I prefer a I don't want this to be merged without a test. This situation looks as if it can easily be provoked in a test; I'll look into it.
You don't. It should always be removed, but I forgot that transition path. That's basically a philosophical question, as outlined above. In my refactoring I tended to be very strict about these matters since I stumbled over so much dead code and undefined behaviour because a task was None, an attribute may be set, etc. I prefer spelling this all out explicitly, e.g. there should now be no situations possible where
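As a rough illustration of the strict, spelled-out style described in this comment, the sketch below uses an explicit transition table that fails loudly on any path nobody wrote down, instead of a catch-all. All names here (`TaskState`, `StrictStateMachine`, the state strings, the `executing` set) are hypothetical and are not taken from `distributed`; the real worker state machine is considerably more involved.

```python
class TaskState:
    """Hypothetical stand-in for a task record; not the distributed class."""

    def __init__(self, key):
        self.key = key
        self.state = "waiting"


class StrictStateMachine:
    """Toy state machine where every allowed transition is listed explicitly."""

    def __init__(self):
        self.executing = set()
        # Every permitted (start, finish) pair is spelled out; nothing is implied.
        self._transitions = {
            ("waiting", "executing"): self._start,
            ("executing", "memory"): self._finish,
            ("executing", "cancelled"): self._finish,
        }

    def _start(self, ts):
        self.executing.add(ts.key)

    def _finish(self, ts):
        # set.remove raises KeyError if the key is missing, which surfaces a
        # broken invariant immediately instead of hiding it.
        self.executing.remove(ts.key)

    def transition(self, ts, new_state):
        try:
            handler = self._transitions[(ts.state, new_state)]
        except KeyError:
            raise RuntimeError(
                f"undefined transition {ts.state!r} -> {new_state!r}"
            ) from None
        handler(ts)
        ts.state = new_state
```

With this shape, a forgotten transition path shows up as an immediate error rather than as a key silently left behind in the bookkeeping set, at the cost of having to enumerate every path up front.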
I believe you answered your own question there. I've been around that block a few times. I dislike our entire interaction with the threadpool, to be honest, since it is very entangled with the state machine and not properly encapsulated. We have the same problem with data fetching coroutines, btw. We end up setting some state before we dispatch a coroutine and then need to put a lot of engineering around ensuring a proper state before and after. That's one of the primary drivers for the resumed/cancelled state mechanics (not the only one) and a very frequent source of past deadlocks. I'm very open to refactoring this part of the code as well. However, I would suggest starting with removing
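The "set some state before dispatching, then guarantee it is cleaned up afterwards" problem mentioned above can be sketched in a few lines. This is a minimal illustration only, assuming a module-level `executing` set and a hypothetical `run_task` helper; it is not how the worker in `distributed` is implemented, and it ignores cancellation and resumption entirely.

```python
import asyncio

# Hypothetical bookkeeping set, standing in for the worker's record of
# which task keys are currently running.
executing = set()


async def run_task(key, func):
    """Run func in the default thread pool while tracking key as executing."""
    executing.add(key)  # state is set *before* the work is dispatched
    try:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, func)
    finally:
        # Runs on success, on exceptions, and on cancellation of this
        # coroutine, so the key cannot be left behind by a forgotten exit path.
        executing.discard(key)
```

Keeping the add and the cleanup in one function is one way to encapsulate the bookkeeping. The trade-off is that the state change is then tied to the coroutine's lifetime rather than to an explicit state-machine transition, which may or may not be what is wanted here.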
This is a possible fix for #5497. With this change, the example in #5497 no longer deadlocks. I'm pushing changes up early to test against the full CI suite.
pre-commit run --all-files
cc @gjoseph92 @fjetter