runtime: only mitigate false sharing for multi-threaded runtimes #6240

Open
wants to merge 5 commits into
base: master

@Darksonn Darksonn commented Dec 22, 2023

In #5809, we padded the task struct to mitigate false sharing when different worker threads poll different tasks. However, a runtime with only a single worker thread has no second worker that could share a cache line with it, so there the padding makes the task struct consume more memory for no gain. This patch stops requiring a high alignment for runtimes without multiple threads.

This updates src/util/cacheline.rs to match what the task struct was annotated with. I don't know why they were different.
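
To make the memory cost concrete, here is a minimal sketch of what a high alignment does to a small struct. The `Plain` and `Padded` types are hypothetical stand-ins rather than the actual task struct, and the value 128 is an assumption chosen for illustration; the real annotation may differ by target architecture.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for a small header; not the real task struct.
#[allow(dead_code)]
struct Plain {
    state: u64,
    queue_next: usize,
}

// Same fields, but forced onto its own 128-byte boundary.
// The value 128 is an assumption for this sketch.
#[allow(dead_code)]
#[repr(align(128))]
struct Padded {
    state: u64,
    queue_next: usize,
}

fn main() {
    // A type's size is always a multiple of its alignment, so the
    // padded variant is rounded up to a full 128 bytes.
    println!(
        "Plain:  size = {:3}, align = {:3}",
        size_of::<Plain>(),
        align_of::<Plain>()
    );
    println!(
        "Padded: size = {:3}, align = {:3}",
        size_of::<Padded>(),
        align_of::<Padded>()
    );
}
```

With several worker threads the extra bytes keep two hot structs off the same cache line; with one worker thread they are only overhead, which is the case this patch targets.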

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Dec 22, 2023
@github-actions github-actions bot added R-loom-current-thread Run loom current-thread tests on this PR R-loom-multi-thread Run loom multi-thread tests on this PR R-loom-multi-thread-alt Run loom multi-thread alt tests on this PR labels Dec 22, 2023

Darksonn commented Dec 22, 2023

Regarding benchmarks, the largest changes I measured are these:

     Running rt_current_thread.rs (target/release/deps/rt_current_thread-fe50f3fb58496d08)
spawn_many_local        time:   [209.79 µs 210.83 µs 212.15 µs]
                        change: [-97.549% -97.512% -97.476%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

spawn_many_remote_idle  time:   [239.74 µs 240.31 µs 241.03 µs]
                        change: [-95.952% -95.801% -95.639%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
     Running rt_multi_threaded.rs (target/release/deps/rt_multi_threaded-20ec79d3770a25c9)
spawn_many_local        time:   [5.1625 ms 5.2042 ms 5.2483 ms]
                        change: [+2297.0% +2325.3% +2353.5%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

spawn_many_remote_idle  time:   [4.8253 ms 4.9544 ms 5.0961 ms]
                        change: [+1910.2% +1962.7% +2019.5%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  2 (2.00%) high mild
  16 (16.00%) high severe

However, this PR really shouldn't change the multi-threaded runtime, so I'm not sure what's going on.
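
For scale, the percentages above can be turned into time ratios. This is a small sketch that assumes the reported change is the relative change in measured time, (new - old) / old; `time_ratio` is a hypothetical helper, not part of any benchmark tooling.

```rust
/// Converts a percent change in time into the ratio new_time / old_time.
/// Assumes the percentage is (new - old) / old * 100.
fn time_ratio(percent_change: f64) -> f64 {
    1.0 + percent_change / 100.0
}

fn main() {
    // Point estimates for spawn_many_local copied from the output above.
    let current_thread = time_ratio(-97.512);
    let multi_thread = time_ratio(2325.3);

    // A ratio below 1 is a speedup; its reciprocal is the speedup factor.
    println!(
        "rt_current_thread: new/old = {:.5}, speedup factor = {:.1}",
        current_thread,
        1.0 / current_thread
    );
    // A ratio above 1 is a slowdown by that factor.
    println!(
        "rt_multi_threaded: new/old = {:.3}",
        multi_thread
    );
}
```

Under that assumption the current-thread result corresponds to roughly a 40x speedup and the multi-threaded result to roughly a 24x slowdown, both far larger than a change in struct alignment would be expected to produce.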

Full benchmark results:
copy_mem_to_mem         time:   [9.3048 µs 9.3243 µs 9.3474 µs]
                        change: [+6.7828% +7.5201% +8.1000%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Benchmarking copy_mem_to_slow_hdd: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 18.1s, or reduce sample count to 20.
copy_mem_to_slow_hdd    time:   [181.16 ms 181.24 ms 181.33 ms]
                        change: [-0.0354% +0.0304% +0.0935%] (p = 0.36 > 0.05)
                        No change in performance detected.

Benchmarking copy_chunk_to_mem: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 12.8s, or reduce sample count to 30.
copy_chunk_to_mem       time:   [128.11 ms 128.21 ms 128.31 ms]
                        change: [-0.0644% +0.0388% +0.1425%] (p = 0.47 > 0.05)
                        No change in performance detected.

Benchmarking copy_chunk_to_slow_hdd: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 14.8s, or reduce sample count to 30.
copy_chunk_to_slow_hdd  time:   [145.63 ms 147.42 ms 149.24 ms]
                        change: [-2.5665% -0.8195% +0.9305%] (p = 0.37 > 0.05)
                        No change in performance detected.

     Running fs.rs (target/release/deps/fs-e0d6305d77b16750)
async_read_std_file     time:   [844.95 µs 847.05 µs 849.34 µs]
                        change: [-1.2503% -0.9644% -0.6505%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

async_read_buf          time:   [7.0472 ms 7.1294 ms 7.2176 ms]
                        change: [-5.5588% -3.4824% -1.2786%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

async_read_codec        time:   [7.2902 ms 7.3829 ms 7.4799 ms]
                        change: [-3.0102% -1.1389% +0.8707%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

sync_read               time:   [735.82 µs 737.61 µs 739.60 µs]
                        change: [-1.7575% -1.0153% -0.4313%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low mild
  6 (6.00%) high mild
  3 (3.00%) high severe

     Running rt_current_thread.rs (target/release/deps/rt_current_thread-fe50f3fb58496d08)
spawn_many_local        time:   [209.79 µs 210.83 µs 212.15 µs]
                        change: [-97.549% -97.512% -97.476%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

spawn_many_remote_idle  time:   [239.74 µs 240.31 µs 241.03 µs]
                        change: [-95.952% -95.801% -95.639%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

spawn_many_remote_busy  time:   [5.4833 ms 5.5163 ms 5.5531 ms]
                        change: [-1.7702% -0.6116% +0.4426%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe

     Running rt_multi_threaded.rs (target/release/deps/rt_multi_threaded-20ec79d3770a25c9)
spawn_many_local        time:   [5.1625 ms 5.2042 ms 5.2483 ms]
                        change: [+2297.0% +2325.3% +2353.5%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

spawn_many_remote_idle  time:   [4.8253 ms 4.9544 ms 5.0961 ms]
                        change: [+1910.2% +1962.7% +2019.5%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  2 (2.00%) high mild
  16 (16.00%) high severe

spawn_many_remote_busy1 time:   [4.5141 ms 4.5364 ms 4.5624 ms]
                        change: [-6.1875% -5.3199% -4.4934%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

spawn_many_remote_busy2 time:   [32.864 ms 32.929 ms 32.994 ms]
                        change: [-1.8810% -1.5219% -1.1253%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

ping_pong               time:   [624.89 µs 628.28 µs 631.71 µs]
                        change: [-34.778% -34.248% -33.746%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

yield_many              time:   [10.198 ms 10.277 ms 10.365 ms]
                        change: [+3.0197% +4.1983% +5.3684%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

chained_spawn           time:   [380.13 µs 386.18 µs 394.32 µs]
                        change: [-0.4208% +1.1919% +2.7754%] (p = 0.15 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

     Running signal.rs (target/release/deps/signal-e709577e254b6a33)
many_signals            time:   [21.473 µs 21.549 µs 21.649 µs]
                        change: [+0.3674% +0.9749% +1.5873%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

     Running spawn.rs (target/release/deps/spawn-9518bb8b6f1b520e)
basic_scheduler_spawn   time:   [550.66 ns 561.36 ns 576.17 ns]
                        change: [+4.6971% +6.0237% +7.7708%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

basic_scheduler_spawn10 time:   [3.6620 µs 3.6766 µs 3.6952 µs]
                        change: [-12.073% -11.642% -11.103%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe

threaded_scheduler_spawn
                        time:   [6.4225 µs 6.5954 µs 6.8083 µs]
                        change: [+0.6278% +5.0076% +9.4991%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 23 outliers among 100 measurements (23.00%)
  18 (18.00%) high mild
  5 (5.00%) high severe

threaded_scheduler_spawn10
                        time:   [8.6431 µs 8.7400 µs 8.8531 µs]
                        change: [-5.2724% -4.1764% -3.1462%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
  1 (1.00%) low severe
  8 (8.00%) low mild
  8 (8.00%) high mild
  2 (2.00%) high severe

     Running sync_mpsc.rs (target/release/deps/sync_mpsc-5752b70c2db3d9fe)
create_medium/1         time:   [181.36 ns 182.27 ns 183.62 ns]
                        change: [-4.2276% -3.8645% -3.4842%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
create_medium/100       time:   [211.19 ns 211.59 ns 212.05 ns]
                        change: [+7.6278% +8.1449% +8.6524%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
create_medium/100000    time:   [211.59 ns 213.14 ns 214.86 ns]
                        change: [+7.6512% +8.1855% +8.7846%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

send/medium_1000        time:   [628.06 ns 651.06 ns 677.97 ns]
                        change: [-2.3102% -0.7496% +1.0315%] (p = 0.42 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
send/large_1000         time:   [21.280 µs 21.380 µs 21.492 µs]
                        change: [+0.4674% +1.2264% +2.0506%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe

contention/bounded      time:   [818.38 µs 830.56 µs 842.56 µs]
                        change: [-5.2379% -3.5729% -1.7886%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
contention/bounded_recv_many
                        time:   [705.19 µs 714.52 µs 724.10 µs]
                        change: [-1.7431% -0.2526% +1.3561%] (p = 0.75 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
Benchmarking contention/bounded_full: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.5s, enable flat sampling, or reduce sample count to 60.
contention/bounded_full time:   [1.0567 ms 1.0724 ms 1.0874 ms]
                        change: [-2.6902% -0.8583% +1.0087%] (p = 0.35 > 0.05)
                        No change in performance detected.
contention/bounded_full_recv_many
                        time:   [709.17 µs 716.19 µs 724.00 µs]
                        change: [-4.4813% -2.8904% -1.3288%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild
contention/unbounded    time:   [765.71 µs 773.89 µs 782.23 µs]
                        change: [-1.3765% -0.3048% +0.7719%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
contention/unbounded_recv_many
                        time:   [741.55 µs 744.57 µs 747.30 µs]
                        change: [+2.6443% +3.3143% +3.9333%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

uncontented/bounded     time:   [533.45 µs 534.91 µs 536.54 µs]
                        change: [-9.9180% -9.2967% -8.7062%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
uncontented/bounded_recv_many
                        time:   [348.37 µs 349.77 µs 351.41 µs]
                        change: [-12.698% -12.424% -12.138%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
uncontented/unbounded   time:   [314.09 µs 314.76 µs 315.58 µs]
                        change: [-1.6096% -1.0156% -0.4660%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
uncontented/unbounded_recv_many
                        time:   [218.86 µs 219.73 µs 220.90 µs]
                        change: [+1.1229% +1.6840% +2.2978%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

     Running sync_mpsc_oneshot.rs (target/release/deps/sync_mpsc_oneshot-6fca575cbc451a46)
request_reply           time:   [332.53 µs 336.54 µs 340.36 µs]
                        change: [-2.7364% -1.9756% -1.3073%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 17 outliers among 100 measurements (17.00%)
  14 (14.00%) high mild
  3 (3.00%) high severe

request_reply #2        time:   [6.1791 ms 6.2325 ms 6.2907 ms]
                        change: [-3.7488% -1.5390% +0.5505%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

     Running sync_notify.rs (target/release/deps/sync_notify-b57b1e9d43acb764)
notify_one/10           time:   [140.64 µs 145.58 µs 150.01 µs]
                        change: [+3.4068% +8.4144% +13.374%] (p = 0.00 < 0.05)
                        Performance has regressed.
notify_one/50           time:   [175.98 µs 177.96 µs 179.82 µs]
                        change: [-0.8943% +0.9071% +2.9080%] (p = 0.36 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low severe
  6 (6.00%) low mild
  4 (4.00%) high mild
notify_one/100          time:   [176.62 µs 178.08 µs 179.58 µs]
                        change: [-2.4467% -1.1869% +0.1156%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
notify_one/200          time:   [181.43 µs 182.70 µs 183.82 µs]
                        change: [-0.4677% +0.3650% +1.1893%] (p = 0.40 > 0.05)
                        No change in performance detected.
notify_one/500          time:   [173.51 µs 174.67 µs 175.90 µs]
                        change: [-5.0042% -3.9236% -2.9413%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

notify_waiters/10       time:   [318.30 µs 325.34 µs 331.81 µs]
                        change: [-1.6981% +0.8929% +3.3472%] (p = 0.48 > 0.05)
                        No change in performance detected.
notify_waiters/50       time:   [171.80 µs 175.12 µs 179.17 µs]
                        change: [-4.5028% -2.1952% +0.5742%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
notify_waiters/100      time:   [173.99 µs 176.61 µs 179.30 µs]
                        change: [+7.2544% +9.2132% +11.004%] (p = 0.00 < 0.05)
                        Performance has regressed.
notify_waiters/200      time:   [168.20 µs 169.97 µs 171.78 µs]
                        change: [-2.3022% -0.8143% +0.7264%] (p = 0.32 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
notify_waiters/500      time:   [183.00 µs 184.95 µs 186.72 µs]
                        change: [-0.3740% +1.8240% +3.8883%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) low mild
  1 (1.00%) high mild

     Running sync_rwlock.rs (target/release/deps/sync_rwlock-56cf725d80301d8f)
contention/read_concurrent
                        time:   [608.57 ns 610.03 ns 611.75 ns]
                        change: [+4.2862% +4.7392% +5.1771%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
contention/read_concurrent_multi
                        time:   [9.5178 µs 9.7528 µs 10.017 µs]
                        change: [-2.9701% +0.1629% +3.4391%] (p = 0.92 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

uncontented/read        time:   [428.79 ns 431.37 ns 434.28 ns]
                        change: [+10.541% +11.139% +11.733%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe
uncontented/read_concurrent
                        time:   [555.19 ns 556.64 ns 558.27 ns]
                        change: [-7.2184% -6.6975% -6.2231%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
uncontented/read_concurrent_multi
                        time:   [10.233 µs 10.363 µs 10.494 µs]
                        change: [+4.6271% +7.0024% +9.3325%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

     Running sync_semaphore.rs (target/release/deps/sync_semaphore-b06ddbea289b0faf)
contention/concurrent_multi
                        time:   [9.4801 µs 9.6512 µs 9.8238 µs]
                        change: [-8.2682% -6.3414% -4.1513%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
contention/concurrent_single
                        time:   [599.33 ns 600.24 ns 601.20 ns]
                        change: [-6.7998% -6.5221% -6.2633%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) low mild
  6 (6.00%) high mild
  2 (2.00%) high severe

uncontented/multi       time:   [424.36 ns 426.61 ns 429.17 ns]
                        change: [+2.0118% +2.8018% +3.5373%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
uncontented/concurrent_multi
                        time:   [9.0743 µs 9.3270 µs 9.5928 µs]
                        change: [-10.676% -8.2759% -5.8102%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
uncontented/concurrent_single
                        time:   [601.14 ns 602.51 ns 604.09 ns]
                        change: [-0.5622% -0.1958% +0.2107%] (p = 0.32 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe

     Running sync_watch.rs (target/release/deps/sync_watch-60ac25831855913c)
Benchmarking contention_resubscribe/10: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.8s, enable flat sampling, or reduce sample count to 50.
contention_resubscribe/10
                        time:   [1.8530 ms 1.8948 ms 1.9403 ms]
                        change: [-3.4263% -0.8413% +1.6641%] (p = 0.52 > 0.05)
                        No change in performance detected.
contention_resubscribe/100
                        time:   [7.0210 ms 7.1514 ms 7.2854 ms]
                        change: [-3.8380% -0.9263% +1.8749%] (p = 0.53 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
contention_resubscribe/500
                        time:   [26.274 ms 26.747 ms 27.299 ms]
                        change: [-0.1387% +1.9717% +4.1448%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
Benchmarking contention_resubscribe/1000: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.2s, or reduce sample count to 90.
contention_resubscribe/1000
                        time:   [50.212 ms 50.619 ms 51.056 ms]
                        change: [-3.0960% -1.4999% +0.1245%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

     Running time_now.rs (target/release/deps/time_now-2001bfb0b3e6704f)
time_now_current_thread time:   [157.83 ns 158.38 ns 159.01 ns]
                        change: [+2.3068% +2.8169% +3.3379%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

@carllerche

A 2000% perf regression isn't nothing. Given that this is happening in spawn_many benchmarks, I could see something material has changed in the code path. It could be the new code is preventing inlining or something. This will probably require digging into the generated assembly to compare.

@carllerche carllerche left a comment

We should dig into the benchmark regression first.

(leaving this as a review to block merging)

@Darksonn

The benchmarks are nonsense. See #6243 for the fix. New benchmarks are on their way ...
