Double counting of network transfer cost #7003

Open · Tracked by #6993
fjetter opened this issue Sep 5, 2022 · 3 comments


fjetter commented Sep 5, 2022

We're double counting estimated network cost in multiple places.

First, we calculate the estimated network cost of the dependencies a worker would need to fetch in `_set_duration_estimate` and assign the result to `WorkerState.processing`, i.e. `processing = compute + comm`. This value is also used to set the worker's occupancy.
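For reference, here is a minimal sketch of that accounting (assumed names and simplified logic, not the actual `distributed` source):

```python
# Sketch: the transfer estimate is folded into both `processing` and
# `occupancy`. `ts` is a task with `.dependencies`; `ws` is a worker
# with `.processing` and `.occupancy`.
def set_duration_estimate(ts, ws, bandwidth, duration_average):
    nbytes = sum(dep.nbytes for dep in ts.dependencies if ws not in dep.who_has)
    comm_cost = nbytes / bandwidth                     # estimated transfer time
    ws.processing[ts] = duration_average + comm_cost   # processing = compute + comm
    ws.occupancy += duration_average + comm_cost       # occupancy inherits comm cost
```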

When making a scheduling decision, we typically use `Scheduler.worker_objective`, which calculates a `start_time` defined as

```python
stack_time: float = ws.occupancy / ws.nthreads
start_time: float = stack_time + comm_bytes / self.bandwidth
```

i.e.

```
start_time = ws.occupancy / ws.nthreads + comm_bytes / self.bandwidth
           = ws.occupancy / ws.nthreads + comm_cost
```

where `occupancy ~ sum(... TaskPrefix.duration_average + comm_cost)`. This is wrong on two counts:

1. comm cost should be constant and not scale with `nthreads`, yet it is hidden inside `occupancy`, which is divided by `nthreads`
2. we should only account for `comm_cost` once, yet it appears both inside `occupancy` and as an explicit second term (see the worked example below)
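To make the double counting concrete, here is a worked example (all numbers hypothetical):

```python
# Hypothetical numbers, purely for illustration:
nthreads = 4
duration_average = 1.0                     # estimated compute time (s)
comm_cost = 2.0                            # estimated transfer time (s)

occupancy = duration_average + comm_cost   # 3.0 -- comm cost already included

stack_time = occupancy / nthreads          # 0.75 -- comm cost wrongly scaled by 1/nthreads
start_time = stack_time + comm_cost        # 2.75 -- comm cost counted a second time

# What the objective arguably intends:
expected = duration_average / nthreads + comm_cost   # 2.25
```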

A similar double counting is introduced on the work stealing side when calculating the `cost_multiplier`:

```python
compute_time = ws.processing[ts]  # occupancy, i.e. already includes comm cost
transfer_time = nbytes / self.scheduler.bandwidth + LATENCY
cost_multiplier = transfer_time / compute_time
```

If we ignore latency for now, this yields something like

```
cost_multiplier ~ (nbytes / bandwidth) / (duration_average + nbytes / bandwidth)
                = nbytes / (bandwidth * duration_average + nbytes)
```

i.e. for network-heavy tasks this converges towards 1, which is quite the opposite of what this ratio is supposed to encode: a task whose transfer dominates its compute should get a large multiplier, not one close to 1.
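A quick illustration with made-up numbers shows the saturation:

```python
# Hypothetical numbers: the multiplier saturates at 1 for network-heavy
# tasks instead of growing with the transfer/compute ratio.
bandwidth = 100e6                      # 100 MB/s
duration_average = 0.1                 # compute time (s)

for nbytes in (1e6, 100e6, 10e9):
    transfer_time = nbytes / bandwidth
    compute_time = duration_average + transfer_time  # occupancy includes comm again
    print(transfer_time / compute_time, transfer_time / duration_average)

# multiplier:      0.09,  0.91,  0.999   (saturates at 1)
# intended ratio:  0.1,   10,    1000
```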


fjetter commented Sep 7, 2022

There is another double/multiple counting problem in `_set_duration_estimate` that concerns tasks with shared dependencies.

`_set_duration_estimate` is evaluated once per task without any regard for shared dependencies. Therefore, specifically for graphs where N tasks share one common node, that node's transfer cost is vastly overestimated since it is counted N times.
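A toy reproduction of the effect (the `Task` class below is a hypothetical stand-in for the scheduler's `TaskState`):

```python
from dataclasses import dataclass, field

@dataclass
class Task:                            # hypothetical stand-in for TaskState
    name: str
    nbytes: int = 0
    dependencies: list = field(default_factory=list)

bandwidth = 100e6                      # 100 MB/s, assumed
shared = Task("shared", nbytes=10**9)  # one 1 GB dependency shared by all tasks
tasks = [Task(f"task-{i}", dependencies=[shared]) for i in range(10)]

# _set_duration_estimate-style accounting: once per task, no deduplication
total_comm = sum(
    sum(dep.nbytes for dep in ts.dependencies) / bandwidth
    for ts in tasks
)
print(total_comm)  # 100.0 s -- 10x the ~10 s needed to transfer `shared` once
```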


fjetter commented Sep 7, 2022

This double counting can be catastrophic in cases where the transfer cost is larger than, or of similar size to, the compute cost. Apart from an erroneous `worker_objective`, it can lead to workers being misclassified as idle, which then triggers very aggressive work stealing where all tasks are stolen by the worker holding the dependency. An extreme example is #6573.


fjetter commented Sep 8, 2022

This double counting appears to go back to #773
