Recommendation Requested / Observed mismatch #8119

Vesyrak · 2023-08-21T08:36:07Z

When the adaptive core receives a target that calls for a scale-down, it always appears to pick the first worker for removal, because of this line:

distributed/distributed/deploy/adaptive_core.py

Line 192 in 716d526

not_yet_arrived = requested - observed

This happens because requested and observed hold different kinds of identifiers.
requested holds worker names, which the cluster assigns as incrementing integers:

distributed/distributed/deploy/spec.py

Line 550 in 716d526

new_worker_name = self._new_worker_name(self._i)

However, the observed names, i.e. the names the scheduler receives from the workers, appear to be the worker addresses.
Because the two sets never overlap, the set difference is always the full requested set. The adaptive core therefore assumes it is still waiting for those workers and treats them as safe to kill. This is counterproductive: the adaptive algorithm ends up killing workers based on ordering rather than on idleness.
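To illustrate the mismatch described above, here is a minimal, self-contained sketch. It does not import distributed; the helper name pick_not_yet_arrived, the example addresses, and the worker count are all made up for illustration, and the logic is only my reading of the set difference quoted earlier, not the library's actual implementation.

```python
from itertools import islice


def pick_not_yet_arrived(requested, observed, n_to_close):
    """Hypothetical helper: choose up to n_to_close workers that were
    requested but have not (apparently) shown up at the scheduler."""
    not_yet_arrived = set(requested) - set(observed)
    return set(islice(not_yet_arrived, n_to_close))


# Names as the cluster assigns them: incrementing integers (assumed values).
requested = {0, 1, 2, 3}

# What the scheduler appears to report: addresses (made-up examples).
observed_addresses = {
    "tcp://10.0.0.1:40001",
    "tcp://10.0.0.2:40001",
    "tcp://10.0.0.3:40001",
    "tcp://10.0.0.4:40001",
}

# What the comparison would need: the same kind of identifier on both sides.
observed_names = {0, 1, 2, 3}

# No overlap, so every requested worker looks "not yet arrived"
# and running workers are selected for closing.
mismatched = pick_not_yet_arrived(requested, observed_addresses, 2)

# Same identifier type, so nothing is wrongly treated as missing.
matched = pick_not_yet_arrived(requested, observed_names, 2)

print(len(mismatched))  # 2
print(matched)          # set()
```

With matching identifiers the set difference is empty, so the choice of which workers to close would fall through to whatever idleness-based selection the cluster provides instead of depending on set ordering.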

Environment:

Dask version: 2023.3.0
Python version: 3.11
Operating System: Ubuntu 22.04 (docker)
Install method (conda, pip, source): pip

Vesyrak · 2023-08-21T11:37:05Z

This mismatch probably also affects other parts of the code, for example here:

distributed/distributed/deploy/spec.py

Line 510 in 716d526

not_yet_launched = set(self.worker_spec) - {

Vesyrak · 2023-08-21T11:45:15Z

Highly related issue: #4532

github-actions bot added the needs triage label Aug 21, 2023

Vesyrak linked a pull request Oct 13, 2023 that will close this issue

fix(adaptive): fixed comparison in recommendations when scaling down #8273

Open

Vesyrak mentioned this issue Nov 13, 2023

Ensure adaptive properties work as expected for SpecCluster #8324

Open

Vesyrak mentioned this issue Dec 28, 2023

Distributed core logic maintenance #8432

Closed
