Fine performance metrics: measure time spent entering/exiting thread pools #7681

crusaderky · 2023-03-17T17:55:02Z

Part of Fine performance metrics meta-issue #7665
Related to Compression slows down network comms #7655
Related to Task should be executing upon rejoining. #5882

While investigating #7655, I began to suspect that we may have a bottleneck caused by the fact that there's only one worker thread in the offload executor.
Additionally, there's a known issue (#5882) where, whenever a task calls rejoin(), there is one fewer thread available than the WorkerStateMachine believes, which causes Worker.execute to get stuck for a long time on the call to run_in_executor.

In both cases, as of #7586 this time is displayed as ("execute", <prefix>, "other", "seconds").
Proposal: wrap these two calls to run_in_executor in with context_meter.meter("offload"): and with context_meter.meter("executor"): blocks, respectively, so that the time spent waiting for a thread is reported under its own label.
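To illustrate why this wait time matters, here is a minimal, self-contained sketch. It does not use distributed's actual context_meter: the `meter` helper and the `metrics` dict below are hypothetical stand-ins, and `slow_task` is a made-up workload. Two coroutines submit work to a single-thread pool at the same time, so the second call spends part of its metered time simply queueing behind the first.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager

# Hypothetical stand-in for a metrics store; NOT distributed's context_meter.
metrics: dict[str, float] = {}


@contextmanager
def meter(label: str):
    """Accumulate wall-clock seconds spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        metrics[label] = metrics.get(label, 0.0) + elapsed


def slow_task() -> str:
    # Made-up workload: 50 ms of blocking work.
    time.sleep(0.05)
    return "done"


async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    # A pool with a single thread, mirroring the one-worker offload executor.
    with ThreadPoolExecutor(max_workers=1) as offload:

        async def offloaded() -> str:
            # The metered span covers queueing for the thread plus the run itself.
            with meter("offload"):
                return await loop.run_in_executor(offload, slow_task)

        # Both calls are submitted together; the second must wait for the first.
        return await asyncio.gather(offloaded(), offloaded())


results = asyncio.run(main())
print(results, metrics)
```

The two tasks perform roughly 0.10 s of work in total, but the accumulated "offload" time is closer to 0.15 s because the second call waits about 0.05 s for the only thread to become free. Separating that wait from the work itself is what makes this kind of bottleneck visible.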


crusaderky added the diagnostics label Mar 17, 2023

crusaderky mentioned this issue Mar 17, 2023

Fine performance metrics meta-issue #7665

Open

crusaderky mentioned this issue Apr 6, 2023

Meter queue time to the offload executor #7758

Merged

hendrikmakait closed this as completed in #7758 Apr 11, 2023
