Avoid work-stealing in bytecode compilation #4004

Merged: 2 commits into astral-sh:main on Jun 4, 2024

Conversation

@ibraheemdev (Member) commented on Jun 4, 2024:

Summary

Avoid using work-stealing Tokio workers for bytecode compilation, favoring dedicated threads instead. Tokio's work-stealing doesn't really benefit us here: we spawn the Python workers and schedule tasks ourselves, and we don't want Tokio to re-balance our workers. Because we do our own scheduling and compilation is a primarily compute-bound task, we can also create a dedicated runtime for each worker and avoid some synchronization overhead.

This is part of a general desire to rely less on Tokio's work-stealing scheduler and be smarter about our workload. In this case we already had the custom scheduler in place; Tokio was just getting in the way (though the overhead is very minor).
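
To make the pattern concrete, here is a minimal sketch (assuming a Tokio dependency; `spawn_worker` and `worker_task` are hypothetical names, not uv's actual API): each worker gets a dedicated OS thread running its own single-threaded runtime, so tasks are never re-balanced across workers.

```rust
use std::thread;

// Hypothetical sketch: one dedicated OS thread per worker, each
// driving its own single-threaded Tokio runtime, so nothing is
// ever re-balanced across workers.
fn spawn_worker<F>(worker_task: F) -> std::io::Result<thread::JoinHandle<()>>
where
    F: std::future::Future<Output = ()> + Send + 'static,
{
    thread::Builder::new()
        .name("uv-compile".to_owned())
        .spawn(move || {
            tokio::runtime::Builder::new_current_thread()
                .enable_all()
                .build()
                .expect("Failed to build runtime")
                .block_on(worker_task);
        })
}
```

Because each runtime is current-thread, a worker's tasks never contend with other workers for a shared scheduler, which is where the saved synchronization overhead comes from.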

Test Plan

This improves performance by ~5% on my machine.

```
$ hyperfine --warmup 1 --prepare "target/profiling/uv-dev clear-compile .venv" "target/profiling/uv-dev compile .venv" "target/profiling/uv-dev-dedicated compile .venv"
Benchmark 1: target/profiling/uv-dev compile .venv
  Time (mean ± σ):      1.279 s ±  0.011 s    [User: 13.803 s, System: 2.998 s]
  Range (min … max):    1.261 s …  1.296 s    10 runs

Benchmark 2: target/profiling/uv-dev-dedicated compile .venv
  Time (mean ± σ):      1.220 s ±  0.021 s    [User: 13.997 s, System: 3.330 s]
  Range (min … max):    1.198 s …  1.272 s    10 runs

Summary
  target/profiling/uv-dev-dedicated compile .venv ran
    1.05 ± 0.02 times faster than target/profiling/uv-dev compile .venv

$ hyperfine --warmup 1 --prepare "target/profiling/uv-dev clear-compile .venv" "target/profiling/uv-dev compile .venv" "target/profiling/uv-dev-dedicated compile .venv"
Benchmark 1: target/profiling/uv-dev compile .venv
  Time (mean ± σ):      3.631 s ±  0.078 s    [User: 47.205 s, System: 4.996 s]
  Range (min … max):    3.564 s …  3.832 s    10 runs

Benchmark 2: target/profiling/uv-dev-dedicated compile .venv
  Time (mean ± σ):      3.521 s ±  0.024 s    [User: 48.201 s, System: 5.392 s]
  Range (min … max):    3.484 s …  3.566 s    10 runs

Summary
  target/profiling/uv-dev-dedicated compile .venv ran
    1.03 ± 0.02 times faster than target/profiling/uv-dev compile .venv
```

@ibraheemdev requested review from @konstin and @BurntSushi on Jun 4, 2024.
@konstin (Member) left a comment:

Cool find, I didn't realize we were paying a premium for work-stealing!

```rust
// Each worker thread blocks on its own dedicated runtime.
tokio::runtime::Builder::new_current_thread()
    .enable_all()
    .build()
    .expect("Failed to build runtime")
    .block_on(worker)
```
Reviewer (Member):

Does it still make sense for the worker to be async? We're not really using any async-specific features there except for timeouts.

@ibraheemdev (Member, author) replied on Jun 4, 2024:

I looked into making them non-async very briefly, but getting the timeouts to work with synchronous I/O is not trivial, so I left it for now. I don't think it's a huge deal because most of the work runs in a separate process anyway, but we could probably get some minor gains from stripping out the async.
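
For context, the timeout is a one-liner under async I/O, which is what makes stripping the async non-trivial. A hedged sketch of the pattern (the `read_response` helper and the ten-second duration are illustrative, not uv's actual code):

```rust
use std::time::Duration;
use tokio::io::{AsyncBufReadExt, AsyncRead, BufReader};

// Illustrative sketch: with async I/O, bounding a read from the
// Python worker's pipe is a one-liner via `tokio::time::timeout`.
async fn read_response<R: AsyncRead + Unpin>(
    child_stdout: R,
) -> Result<String, Box<dyn std::error::Error>> {
    let mut line = String::new();
    // Outer `?` fails on timeout; inner `?` fails on I/O error.
    tokio::time::timeout(
        Duration::from_secs(10),
        BufReader::new(child_stdout).read_line(&mut line),
    )
    .await??;
    Ok(line)
}
```

Replicating this over blocking pipe reads would mean managing read deadlines or a watchdog thread by hand.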

@konstin added the performance label on Jun 4, 2024.
```rust
std::thread::Builder::new()
    .name("uv-compile".to_owned())
    .spawn(move || {
        // Report panics back to the main thread.
        let result = panic::catch_unwind(AssertUnwindSafe(|| {
```
Reviewer (Member):

I'm probably missing something silly here, but why the catch_unwind? You have a thread boundary here, so you should be able to just extract the panic from the result of joining the thread?

@ibraheemdev (Member, author) replied:

We wait on the threads asynchronously, which is why we use the oneshot channel instead of blocking on join.
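
A sketch of that pattern (the `run_on_thread` helper is a hypothetical name): the spawned thread forwards its `catch_unwind` result through a `tokio::sync::oneshot` channel, which the async side can await without blocking on `JoinHandle::join`.

```rust
use std::panic::{self, AssertUnwindSafe};
use tokio::sync::oneshot;

// Illustrative sketch: run `work` on a dedicated thread and await
// the result (or the caught panic) from async code.
async fn run_on_thread<T, F>(work: F) -> std::thread::Result<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = oneshot::channel();
    std::thread::spawn(move || {
        // Catch panics so they surface on the async side.
        let result = panic::catch_unwind(AssertUnwindSafe(work));
        let _ = tx.send(result);
    });
    rx.await.expect("worker thread dropped the channel")
}
```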

Reviewer (Member):

Hmmm okay, I think that makes sense to me. Thanks!

@ibraheemdev merged commit 3b8f3a7 into astral-sh:main on Jun 4, 2024
46 checks passed
ibraheemdev added a commit that referenced this pull request on Jul 9, 2024:
## Summary

Move completely off Tokio's multi-threaded runtime. We've slowly been
making changes to be smarter about scheduling in various places instead
of depending on Tokio's general-purpose work-stealing, notably
#3627 and
#4004. We no longer benefit from the
multi-threaded runtime, as we now run all I/O on the main thread.
There's one remaining instance of `block_in_place` that can be swapped
for `rayon::spawn`.

This change is a small performance improvement due to removing some
unnecessary overhead of the multi-threaded runtime (e.g. spawning
threads), but nothing major. It also removes some noise from profiles.
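
As an illustration of that remaining swap (a sketch, not the actual change; `offload` and `cpu_heavy` are placeholders), the blocking work moves onto rayon's pool and its result comes back over a oneshot channel, so the runtime thread is never blocked:

```rust
use tokio::sync::oneshot;

// Illustrative sketch: replace `tokio::task::block_in_place` by
// handing compute-bound work to rayon and awaiting the result.
async fn offload() -> u64 {
    let (tx, rx) = oneshot::channel();
    rayon::spawn(move || {
        // `cpu_heavy` stands in for the real work.
        let _ = tx.send(cpu_heavy());
    });
    rx.await.expect("rayon task dropped the channel")
}

fn cpu_heavy() -> u64 {
    (0..1_000_000u64).sum()
}
```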

## Test Plan

```
Benchmark 1: ./target/profiling/uv (resolve-warm)
  Time (mean ± σ):      14.9 ms ±   0.3 ms    [User: 3.0 ms, System: 17.3 ms]
  Range (min … max):    14.1 ms …  15.8 ms    169 runs
 
Benchmark 2: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):      16.1 ms ±   0.3 ms    [User: 3.9 ms, System: 18.7 ms]
  Range (min … max):    15.1 ms …  17.3 ms    162 runs
 
Summary
  ./target/profiling/uv (resolve-warm) ran
    1.08 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```