Worker.data_needed is a list of tasks that need data to be fetched. Currently it's just FIFO by task submission time. So in theory, if a higher-priority task gets submitted after a lower-priority one, the low-priority one will get its data fetched first, and therefore probably run first.
I'm not sure in reality if this is much of an issue. We tend to see the biggest priority-vs-FIFO-ordering issues with root tasks; since they don't have dependencies, they're not relevant here. But it does just feel odd. The question is how often tasks with dependencies get submitted to workers out of priority order.
Switching this to a priority heap would be pretty easy and probably pretty cheap. There's even a TODO for it:
distributed/distributed/worker.py
Line 437 in 3f86e58
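To make the difference concrete, here is a minimal sketch of FIFO versus heap ordering. It is not the code in `worker.py`: the task keys, the integer priorities, and the convention that a smaller value is more urgent are all assumptions for illustration (that convention is just what `heapq` pops first).

```python
import heapq
from collections import deque
from itertools import count

# Hypothetical tasks as (priority, key), listed in submission order.
# Assumption: a smaller priority value means "more urgent".
submitted = [(3, "low"), (1, "high"), (2, "mid")]

# FIFO: data is fetched in submission order, ignoring priority.
fifo = deque(key for _, key in submitted)
fifo_order = [fifo.popleft() for _ in range(len(submitted))]

# Priority heap: the most urgent task is fetched first.
# The counter breaks ties, so equal priorities still come out FIFO.
heap = []
tiebreak = count()
for priority, key in submitted:
    heapq.heappush(heap, (priority, next(tiebreak), key))
heap_order = [heapq.heappop(heap)[2] for _ in range(len(submitted))]

print(fifo_order)  # ['low', 'high', 'mid']
print(heap_order)  # ['high', 'mid', 'low']
```

Pushes and pops are O(log n) instead of O(1), which is why the change should be cheap unless the queue gets very large.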