Double counting of network transfer cost #7003

Open · Tracked by #6993
fjetter opened this issue Sep 5, 2022 · 3 comments


fjetter commented Sep 5, 2022

We're double counting estimated network cost in multiple places.

First, we calculate the estimated network cost of the dependencies a worker would need to fetch in `_set_duration_estimate` and assign the result to `WorkerState.processing`, i.e. `processing = compute + comm`. This value is also used to set the worker's occupancy.
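For reference, here is a minimal sketch of that accounting (assumed names and simplified logic, not the actual `distributed` source):

```python
# Sketch: the transfer estimate is folded into both `processing` and
# `occupancy`. `ts` is a task with `.dependencies`; `ws` is a worker
# with `.processing` and `.occupancy`.
def set_duration_estimate(ts, ws, bandwidth, duration_average):
    nbytes = sum(dep.nbytes for dep in ts.dependencies if ws not in dep.who_has)
    comm_cost = nbytes / bandwidth                     # estimated transfer time
    ws.processing[ts] = duration_average + comm_cost   # processing = compute + comm
    ws.occupancy += duration_average + comm_cost       # occupancy inherits comm cost
```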

When making a scheduling decision, we typically use `Scheduler.worker_objective`, which calculates a `start_time` defined as

```python
stack_time: float = ws.occupancy / ws.nthreads
start_time: float = stack_time + comm_bytes / self.bandwidth
```

i.e.

```
start_time = ws.occupancy / ws.nthreads + comm_bytes / self.bandwidth
           = ws.occupancy / ws.nthreads + comm_cost
```

where `occupancy ~ sum(... TaskPrefix.duration_average + comm_cost)`. This is wrong on two counts:

1. comm cost should be constant and not scale with `nthreads`, yet it is hidden inside `occupancy`, which is divided by `nthreads`
2. we should only account for `comm_cost` once, yet it appears both inside `occupancy` and as an explicit second term (see the worked example below)
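To make the double counting concrete, here is a worked example (all numbers hypothetical):

```python
# Hypothetical numbers, purely for illustration:
nthreads = 4
duration_average = 1.0                     # estimated compute time (s)
comm_cost = 2.0                            # estimated transfer time (s)

occupancy = duration_average + comm_cost   # 3.0 -- comm cost already included

stack_time = occupancy / nthreads          # 0.75 -- comm cost wrongly scaled by 1/nthreads
start_time = stack_time + comm_cost        # 2.75 -- comm cost counted a second time

# What the objective arguably intends:
expected = duration_average / nthreads + comm_cost   # 2.25
```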

A similar double counting is introduced on the work stealing side when calculating the `cost_multiplier`:

```python
compute_time = ws.processing[ts]  # occupancy, i.e. already includes comm cost
transfer_time = nbytes / self.scheduler.bandwidth + LATENCY
cost_multiplier = transfer_time / compute_time
```

If we ignore latency for now, this yields something like

```
cost_multiplier ~ (nbytes / bandwidth) / (duration_average + nbytes / bandwidth)
                = nbytes / (bandwidth * duration_average + nbytes)
```

i.e. for network-heavy tasks this converges towards 1, which is quite the opposite of what this ratio is supposed to encode: a task whose transfer dominates its compute should get a large multiplier, not one close to 1.
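A quick illustration with made-up numbers shows the saturation:

```python
# Hypothetical numbers: the multiplier saturates at 1 for network-heavy
# tasks instead of growing with the transfer/compute ratio.
bandwidth = 100e6                      # 100 MB/s
duration_average = 0.1                 # compute time (s)

for nbytes in (1e6, 100e6, 10e9):
    transfer_time = nbytes / bandwidth
    compute_time = duration_average + transfer_time  # occupancy includes comm again
    print(transfer_time / compute_time, transfer_time / duration_average)

# multiplier:      0.09,  0.91,  0.999   (saturates at 1)
# intended ratio:  0.1,   10,    1000
```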


fjetter commented Sep 7, 2022

There is another double/multiple counting problem in `_set_duration_estimate` that concerns tasks with shared dependencies.

`_set_duration_estimate` is evaluated once per task without any regard for shared dependencies. Therefore, specifically for graphs where N tasks share one common node, that node's transfer cost is vastly overestimated since it is counted N times.
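A toy reproduction of the effect (the `Task` class below is a hypothetical stand-in for the scheduler's `TaskState`):

```python
from dataclasses import dataclass, field

@dataclass
class Task:                            # hypothetical stand-in for TaskState
    name: str
    nbytes: int = 0
    dependencies: list = field(default_factory=list)

bandwidth = 100e6                      # 100 MB/s, assumed
shared = Task("shared", nbytes=10**9)  # one 1 GB dependency shared by all tasks
tasks = [Task(f"task-{i}", dependencies=[shared]) for i in range(10)]

# _set_duration_estimate-style accounting: once per task, no deduplication
total_comm = sum(
    sum(dep.nbytes for dep in ts.dependencies) / bandwidth
    for ts in tasks
)
print(total_comm)  # 100.0 s -- 10x the ~10 s needed to transfer `shared` once
```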


fjetter commented Sep 7, 2022

This double counting can be catastrophic in cases where the transfer cost is larger than, or of similar size to, the compute cost. Apart from an erroneous `worker_objective`, it can lead to workers being misclassified as idle, which then triggers very aggressive work stealing where all tasks are stolen by the worker holding the dependency. An extreme example is #6573.


fjetter commented Sep 8, 2022

This double counting appears to go back to #773
