
workload not balancing during scale up on dask-gateway #5599

Open
chrisroat opened this issue Dec 14, 2021 · 6 comments

Labels: adaptive (All things relating to adaptive scaling), needs info (Needs further information from the user), stealing

Comments

@chrisroat (Contributor)

What happened:

I have a fully parallel workload of 30 tasks (no dependencies), each of which takes ~20 minutes. My cluster autoscales between 10 and 50 workers. When I start the task graph, the 30 tasks are distributed three apiece on each of the 10 initial workers. Eventually the cluster scales up to 30 workers, but sometimes the tasks do not redistribute.

Even after some jobs have finished, so their timing should be known, I have seen the 7 remaining jobs distributed 2-2-2-1 across 4 workers while plenty of workers sat idle.
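
Roughly, the setup looks like this (a sketch, not the original code; `process_chunk` and the Gateway defaults are stand-ins):

```python
# Sketch of the reported setup: a dask-gateway cluster autoscaling between
# 10 and 50 workers, running 30 independent ~20-minute tasks.
from dask_gateway import Gateway

gateway = Gateway()
cluster = gateway.new_cluster()
cluster.adapt(minimum=10, maximum=50)   # autoscaling range from the report
client = cluster.get_client()

def process_chunk(i):
    ...  # placeholder for ~20 minutes of independent work per task

futures = client.map(process_chunk, range(30))  # 30 tasks, no dependencies
results = client.gather(futures)
```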

What you expected to happen:

As new workers come online, they steal tasks from existing workers.

Minimal Complete Verifiable Example:

I tried to replicate this on a local cluster but could not. There, things work as expected and tasks are immediately redistributed.

Anything else we need to know?:

I'm attaching debug scheduler/worker logs, produced with @fjetter's script, captured at the point where 30 workers are available and 10 workers each have 3 tasks.
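
For what it's worth, recent distributed releases also provide Client.dump_cluster_state, which can produce an equivalent dump; a sketch, not the exact script used here:

```python
from distributed import Client

# Connect to the running scheduler (address is a placeholder) and ask it to
# dump scheduler and worker state to a gzip-compressed msgpack file.
# Requires a distributed release that includes Client.dump_cluster_state.
client = Client("tls://<scheduler-address>:8786")
client.dump_cluster_state("cluster_dump", format="msgpack")
```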

Environment:

  • Dask version: 2021.12.0
  • Python version: 3.8.12
  • Operating System: Ubuntu 20
  • Install method (conda, pip, source): conda

scheduler_20211214.pkl.gz

worker_20211214.pkl.gz

@chrisroat (Contributor, Author)

My workers and tasks do have resource constraints. In this case, each worker has 6 CPUs and 45 GB of memory, and this particular task requests 45 GB. Would the recent fix 96d4fd4 potentially solve this problem?
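
For context, the resource setup is along these lines (a sketch; the resource name "MEMORY" and the worker startup details are stand-ins for how our dask-gateway deployment is actually configured):

```python
from distributed import Client

# Workers advertise a custom resource at startup, e.g.
#   dask-worker <scheduler> --nthreads 6 --resources "MEMORY=45e9"
# Each task then requests that resource, so at most one of these 45 GB tasks
# runs on a worker at a time.
client = Client("tls://<scheduler-address>:8786")   # placeholder address

def process_chunk(i):
    ...  # stand-in for the real ~20-minute task

futures = client.map(process_chunk, range(30), resources={"MEMORY": 45e9})
```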

@chrisroat (Contributor, Author)

I deployed a Docker image with 96d4fd4 to our cluster and it didn't seem to make a difference.

@gjoseph92 (Collaborator)

@chrisroat this sounds very similar to #5564 to me, which was recently fixed but not released yet. Can you try on main or with #5572?

@gjoseph92 added the needs info label on Dec 16, 2021
@jrbourbeau (Member)

Agreed with @gjoseph92's point that this seems similar to #5564. FWIW #5572 was included in the distributed==2021.12.0 release, so you should be able to test things out by updating to that release too

@chrisroat (Contributor, Author)

I reported this at 2021.12.0 and again later at 96d4fd4; I see this behavior in both cases. I've attached the scheduler/worker dumps in case they help show what is happening.

@chrisroat (Contributor, Author)

Could the needs info label be removed? I reported this at the requested release. I am starting to dig into these large cluster dump files and would appreciate any tips.

I am now running 2022.1.0, and I'm attaching another cluster dump. In this case the workload doesn't fully balance even without autoscaling:

  • It's a 3D map_overlap, so there are lots of short upstream re-arranging tasks
  • stack-trim-getitem is the group of 680 fused tasks containing the long per-chunk calculation
  • I set the config default task duration for stack-trim-getitem to 500s, which helps mitigate this issue in my local scheduler tests (see the sketch after this list)
  • The cluster has 50+ workers with 15 threads each, i.e. 750+ threads
  • Not all 680 tasks are ready yet; some are stuck behind unfinished upstream re-arranging tasks
  • Only 12-13 threads are being used
  • One task has lots of the "pre-overlap" stitching tasks
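
The duration override mentioned above is set roughly like this (a sketch; it needs to be in effect where the scheduler runs, e.g. via the scheduler's config file or environment, and the task-name prefix comes from the fused task names above):

```python
import dask

# Tell the scheduler to assume 500s for tasks whose name starts with this
# prefix, instead of its default duration guess, which feeds the occupancy
# estimates used for assigning and stealing work.
dask.config.set({
    "distributed.scheduler.default-task-durations": {"stack-trim-getitem": "500s"}
})
```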

Even a patch or hack I could apply in a personal fork would be acceptable.

dump_765b081abd78484495e033a613140c02_20220122.msgpack.gz
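
In case it helps anyone else open it, this is roughly how I'm reading the dump (a sketch that assumes a gzip-compressed msgpack mapping with scheduler and worker sections; adjust to the actual layout):

```python
import gzip

import msgpack

# Assumes the dump is gzip-compressed msgpack; key names and nesting may differ.
with gzip.open("dump_765b081abd78484495e033a613140c02_20220122.msgpack.gz", "rb") as f:
    state = msgpack.unpack(f, raw=False, strict_map_key=False)

print(state.keys())  # expecting scheduler and worker sections
tasks = state.get("scheduler", {}).get("tasks", {})
print(len(tasks), "tasks in scheduler state")
```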
