Increase latency overhead in stealing cost calculation #5390

fjetter · 2021-10-05T13:53:09Z

In my latest dive into the stealing code I investigated some of the logs and saw a lot of ridiculous steal requests: task durations of ~10ms and occupancy differences between thief and victim of ~100ms.

Not only do we not care about such a difference, but the act of stealing is guaranteed to be more expensive than letting things be.

Stealing requires at least three network bounces (steal-request, steal-confirm, compute-task), which include code serialization if successful. It is almost impossible to do this in the currently hard-coded 1ms. The 100ms I propose are likely too conservative, but I don't think this is necessarily a bad thing for stealing. I don't have time for large-scale tests but am very confident that this should be much higher than it is right now. Thoughts, concerns?

cc @gjoseph92 @crusaderky
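To illustrate the argument above, here is a minimal sketch of how a fixed latency term can dominate the estimated cost of moving a short task. It is not the scheduler's actual code: the function name `steal_cost_ratio`, the bandwidth figure, and the assumption that the task has no dependencies to transfer are all hypothetical. Only the ~10ms task duration and the 1ms and 100ms latency values come from the description.

```python
def steal_cost_ratio(compute_time, nbytes, bandwidth, latency):
    """Ratio of estimated move cost to compute time (hypothetical helper).

    A ratio well below 1 makes a task look cheap to steal; a ratio
    above 1 means moving it is estimated to cost more than running it.
    """
    transfer_time = nbytes / bandwidth + latency
    return transfer_time / compute_time


TASK_DURATION = 0.010  # ~10ms task, as seen in the logs
BANDWIDTH = 100e6      # assumed bytes/s, for illustration only
NBYTES = 0             # assume no dependency data has to move

# Fixed overhead of 1ms (old value) versus 100ms (proposed value).
ratio_1ms = steal_cost_ratio(TASK_DURATION, NBYTES, BANDWIDTH, latency=0.001)
ratio_100ms = steal_cost_ratio(TASK_DURATION, NBYTES, BANDWIDTH, latency=0.100)

print(f"latency=1ms   -> cost ratio {ratio_1ms:.2f}")
print(f"latency=100ms -> cost ratio {ratio_100ms:.2f}")
```

Under these assumptions, the 1ms overhead makes a 10ms task look cheap to move, while the 100ms overhead makes the estimated move cost exceed the compute time, so such a task would no longer look attractive to steal.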

fjetter · 2021-10-05T13:55:22Z

fwiw, I don't even consider it worth it to measure this properly. We are working with so many estimations in the stealing code that an accurate measurement of this offset is not worth it imho

gjoseph92 · 2021-10-05T18:59:36Z

Frankly 0.1s doesn't even seem that conservative to me.

Increase latency for stealing

31e26a1

fjetter changed the title ~~Increase latency for stealing~~ Increase latency overhead in stealing cost calculation Oct 6, 2021

fjetter mentioned this pull request Oct 8, 2021

Fix a race condition which would allow a rescheduled task to be reported missing even though it is not #5160

Merged

crusaderky approved these changes Oct 19, 2021


crusaderky merged commit a8151a6 into dask:main Oct 19, 2021

zanieb pushed a commit to zanieb/distributed that referenced this pull request Oct 28, 2021

Increase latency for stealing (dask#5390)

f38101b

fjetter mentioned this pull request Apr 13, 2022

Allow stealing of fast tasks in some situations #6115

Open

fjetter added the stealing label Jun 20, 2022
