-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow unknown tasks to be stolen #5572
Conversation
0ddda3e
to
03e2111
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Having start and stop is nice!
|
||
steal = s.extensions["stealing"] | ||
await steal.stop() | ||
lock = await Semaphore(max_leases=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a Semaphore instead of a Lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had issues with the lock and have had some in the past. It somehow didn't lock. I may have forgotten to await something, idk. fwiw, I think we should throw lock away and replace it with the semaphore implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lock = lambda: Semaphore(max_leases=1)
seems reasonable to me. Nice to have fewer codepaths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do something like the following but that's a matter of taste :)
class Lock(Semaphore):
def __init__(*args, **kwargs):
super().__init__(*args, **kwargs, max_leases=1)
The semaphore also implements lease timeouts such that it handles dying workers gracefully by releasing the lock.
IIRC, the reason why I didn't do this from the beginning is that there are subtly differences in API causing the Lock API to break. For instance, you need to await a semaphore object before using it while you don't need to do this on a lock, etc. Also a lot of the tests are specific about the extension. However, we might probable just be able to delete all Lock specific tests since the semaphore should cover it anyhow.
|
This effectively reverts #5392 which enforced that unknown tasks couldn't be stolen. This was flagged as a regression since a test asserting this behaviour was introduced in very early stages of the stealing development #278. Back then we didn't have any intermediate runtime information about the task but this is no longer true and this behaviour can be revised.
For testability, I added
WorkStealing.start
andWorkStealing.stop
since I believe for proper timing based tests we should have the possibility to disable the background PC. Otherwise it is very hard to assert specifically what's happening. If preferred, I can factor this out into a dedicated PR