Turn ZeroVec into a struct #2599

Merged: 13 commits, Sep 21, 2022

Manishearth
Member

This avoids branches in most cases by avoiding the enum discriminant check, instead stuffing the discriminant into the zero-ness of the vec.

Supersedes #2554

Seems to give a ~10% improvement, but some of our benches are flaky.
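
One way to picture the change, as a minimal sketch rather than the PR's actual code: keep a raw slice pointer plus a capacity, and read a capacity of zero as "borrowed". The names `SketchVec`, `new_borrowed`, and `new_owned` are hypothetical, and using the capacity as the marker is an assumption about what "zero-ness of the vec" refers to. Under that assumption reads need no `match`, and only `Drop` has to branch.

```rust
use std::marker::PhantomData;
use std::mem::ManuallyDrop;

/// Hypothetical struct form of a borrowed-or-owned slice.
/// Assumption: `capacity == 0` means "borrowed", nonzero means "owned Vec".
pub struct SketchVec<'a, U> {
    buf: *mut [U],
    capacity: usize,
    marker: PhantomData<&'a [U]>,
}

impl<'a, U> SketchVec<'a, U> {
    pub fn new_borrowed(slice: &'a [U]) -> Self {
        Self {
            buf: slice as *const [U] as *mut [U],
            capacity: 0,
            marker: PhantomData,
        }
    }

    pub fn new_owned(vec: Vec<U>) -> Self {
        // Take over the allocation; `Drop` below rebuilds the Vec to free it.
        // A Vec that never allocated has capacity 0 and is treated as
        // borrowed, which is harmless because there is nothing to free.
        let mut vec = ManuallyDrop::new(vec);
        let capacity = vec.capacity();
        let buf = std::ptr::slice_from_raw_parts_mut(vec.as_mut_ptr(), vec.len());
        Self { buf, capacity, marker: PhantomData }
    }

    pub fn is_owned(&self) -> bool {
        self.capacity != 0
    }

    pub fn as_slice(&self) -> &[U] {
        // SAFETY: `buf` always comes from a live `&'a [U]` or from a Vec this
        // value owns, so it points at initialized, aligned elements.
        unsafe { &*self.buf }
    }
}

impl<'a, U> Drop for SketchVec<'a, U> {
    fn drop(&mut self) {
        if self.capacity != 0 {
            let len = self.as_slice().len();
            // SAFETY: pointer, length, and capacity came from a Vec in `new_owned`.
            drop(unsafe { Vec::from_raw_parts(self.buf as *mut U, len, self.capacity) });
        }
    }
}
```

The design trade-off in this sketch is that the owned/borrowed distinction costs no extra field: an owned allocation always has a nonzero capacity, so the zero value is free to carry the "borrowed" meaning.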

@Manishearth
Member Author

Manishearth commented Sep 20, 2022

Running norm benches (with non icu4x benches commented out) gives me:

after this PR
     Running benches/norm.rs (target/release/deps/norm-f1bc19fd5893b7e3)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Gnuplot not found, using plotters backend
el_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [61.981 µs 62.066 µs 62.164 µs]
                        thrpt:  [290.22 Melem/s 290.67 Melem/s 291.07 Melem/s]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe

el_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [90.408 µs 90.814 µs 91.321 µs]
                        thrpt:  [197.56 Melem/s 198.66 Melem/s 199.55 Melem/s]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe

el_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [96.359 µs 96.549 µs 96.768 µs]
                        thrpt:  [186.44 Melem/s 186.86 Melem/s 187.23 Melem/s]
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe

el_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [162.44 µs 162.94 µs 163.40 µs]
                        thrpt:  [121.19 Melem/s 121.54 Melem/s 121.91 Melem/s]

en_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [31.456 µs 31.530 µs 31.616 µs]
                        thrpt:  [2.0417 Gelem/s 2.0474 Gelem/s 2.0522 Gelem/s]

en_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [31.505 µs 31.539 µs 31.575 µs]
                        thrpt:  [2.0444 Gelem/s 2.0467 Gelem/s 2.0489 Gelem/s]
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

en_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [31.353 µs 32.363 µs 33.576 µs]
                        thrpt:  [1.9226 Gelem/s 1.9946 Gelem/s 2.0588 Gelem/s]
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

en_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [31.952 µs 32.018 µs 32.088 µs]
                        thrpt:  [2.0119 Gelem/s 2.0163 Gelem/s 2.0204 Gelem/s]

fr_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [68.362 µs 68.498 µs 68.648 µs]
                        thrpt:  [2.0088 Gelem/s 2.0132 Gelem/s 2.0172 Gelem/s]
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

fr_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [176.91 µs 177.44 µs 178.23 µs]
                        thrpt:  [773.72 Melem/s 777.16 Melem/s 779.51 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

fr_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [182.08 µs 182.77 µs 183.55 µs]
                        thrpt:  [751.27 Melem/s 754.48 Melem/s 757.34 Melem/s]
Found 15 outliers among 100 measurements (15.00%)
  7 (7.00%) high mild
  8 (8.00%) high severe

fr_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [337.12 µs 347.73 µs 358.97 µs]
                        thrpt:  [395.98 Melem/s 408.77 Melem/s 421.65 Melem/s]
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe

ja_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [48.702 µs 49.845 µs 51.271 µs]
                        thrpt:  [228.14 Melem/s 234.67 Melem/s 240.18 Melem/s]
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe

ja_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [59.947 µs 60.144 µs 60.336 µs]
                        thrpt:  [193.86 Melem/s 194.48 Melem/s 195.12 Melem/s]

ja_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [62.130 µs 62.217 µs 62.323 µs]
                        thrpt:  [187.68 Melem/s 188.00 Melem/s 188.27 Melem/s]
Found 15 outliers among 100 measurements (15.00%)
  10 (10.00%) high mild
  5 (5.00%) high severe

ja_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [85.399 µs 85.885 µs 86.591 µs]
                        thrpt:  [142.86 Melem/s 144.03 Melem/s 144.85 Melem/s]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

kn_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [145.30 µs 145.75 µs 146.27 µs]
                        thrpt:  [150.49 Melem/s 151.01 Melem/s 151.48 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

kn_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [128.73 µs 129.03 µs 129.36 µs]
                        thrpt:  [170.15 Melem/s 170.58 Melem/s 170.99 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

kn_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [106.44 µs 106.87 µs 107.36 µs]
                        thrpt:  [205.03 Melem/s 205.96 Melem/s 206.80 Melem/s]
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe

kn_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [185.93 µs 186.89 µs 187.79 µs]
                        thrpt:  [121.23 Melem/s 121.81 Melem/s 122.44 Melem/s]

ko_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [35.644 µs 35.743 µs 35.876 µs]
                        thrpt:  [241.61 Melem/s 242.51 Melem/s 243.18 Melem/s]
Found 16 outliers among 100 measurements (16.00%)
  6 (6.00%) low severe
  5 (5.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

ko_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [133.84 µs 134.29 µs 134.79 µs]
                        thrpt:  [64.307 Melem/s 64.547 Melem/s 64.762 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

ko_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [70.572 µs 70.720 µs 70.892 µs]
                        thrpt:  [122.27 Melem/s 122.57 Melem/s 122.83 Melem/s]
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

ko_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [241.27 µs 241.64 µs 242.03 µs]
                        thrpt:  [69.350 Melem/s 69.462 Melem/s 69.570 Melem/s]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

vi_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [119.79 µs 120.44 µs 121.14 µs]
                        thrpt:  [550.45 Melem/s 553.65 Melem/s 556.67 Melem/s]
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

vi_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [475.63 µs 477.16 µs 478.77 µs]
                        thrpt:  [139.28 Melem/s 139.75 Melem/s 140.20 Melem/s]

vi_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [442.82 µs 445.44 µs 448.74 µs]
                        thrpt:  [148.60 Melem/s 149.70 Melem/s 150.58 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Benchmarking vi_nfd_to_nfc_utf16/icu4x: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, enable flat sampling, or reduce sample count to 60.
vi_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [1.0177 ms 1.0202 ms 1.0232 ms]
                        thrpt:  [81.503 Melem/s 81.737 Melem/s 81.938 Melem/s]

vi_orthographic_to_nfc_utf16/icu4x                                                                            
                        time:   [913.50 µs 916.76 µs 920.27 µs]
                        thrpt:  [83.376 Melem/s 83.695 Melem/s 83.993 Melem/s]
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe

zh_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [41.674 µs 41.885 µs 42.096 µs]
                        thrpt:  [271.83 Melem/s 273.20 Melem/s 274.59 Melem/s]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

zh_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [42.084 µs 42.245 µs 42.417 µs]
                        thrpt:  [269.78 Melem/s 270.87 Melem/s 271.91 Melem/s]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

zh_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [42.226 µs 42.302 µs 42.382 µs]
                        thrpt:  [270.00 Melem/s 270.51 Melem/s 270.99 Melem/s]
Found 9 outliers among 100 measurements (9.00%)
  8 (8.00%) high mild
  1 (1.00%) high severe

zh_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [41.522 µs 41.681 µs 41.846 µs]
                        thrpt:  [273.48 Melem/s 274.56 Melem/s 275.61 Melem/s]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

before this PR
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.

Gnuplot not found, using plotters backend
el_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [67.637 µs 67.761 µs 67.923 µs]
                        thrpt:  [265.61 Melem/s 266.25 Melem/s 266.73 Melem/s]
                 change:
                        time:   [+8.9765% +9.2300% +9.4920%] (p = 0.00 < 0.05)
                        thrpt:  [-8.6691% -8.4500% -8.2371%]
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  4 (4.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

el_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [104.80 µs 106.78 µs 109.31 µs]
                        thrpt:  [165.05 Melem/s 168.96 Melem/s 172.14 Melem/s]
                 change:
                        time:   [+23.870% +27.564% +31.313%] (p = 0.00 < 0.05)
                        thrpt:  [-23.846% -21.608% -19.270%]
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

el_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [108.53 µs 110.65 µs 113.15 µs]
                        thrpt:  [159.44 Melem/s 163.05 Melem/s 166.23 Melem/s]
                 change:
                        time:   [+22.022% +25.758% +29.526%] (p = 0.00 < 0.05)
                        thrpt:  [-22.795% -20.482% -18.048%]
                        Performance has regressed.

el_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [166.20 µs 166.71 µs 167.26 µs]
                        thrpt:  [118.39 Melem/s 118.79 Melem/s 119.15 Melem/s]
                 change:
                        time:   [+3.4655% +4.1124% +4.7831%] (p = 0.00 < 0.05)
                        thrpt:  [-4.5648% -3.9500% -3.3494%]
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

en_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [31.003 µs 31.121 µs 31.268 µs]
                        thrpt:  [2.0644 Gelem/s 2.0742 Gelem/s 2.0821 Gelem/s]
                 change:
                        time:   [-2.1456% -1.7683% -1.3639%] (p = 0.00 < 0.05)
                        thrpt:  [+1.3827% +1.8001% +2.1926%]
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

en_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [30.901 µs 30.964 µs 31.028 µs]
                        thrpt:  [2.0805 Gelem/s 2.0847 Gelem/s 2.0890 Gelem/s]
                 change:
                        time:   [-2.0116% -1.8038% -1.5846%] (p = 0.00 < 0.05)
                        thrpt:  [+1.6101% +1.8370% +2.0529%]
                        Performance has improved.

en_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [30.874 µs 30.949 µs 31.030 µs]
                        thrpt:  [2.0803 Gelem/s 2.0858 Gelem/s 2.0908 Gelem/s]
                 change:
                        time:   [-4.7944% -3.4167% -2.2946%] (p = 0.00 < 0.05)
                        thrpt:  [+2.3485% +3.5375% +5.0358%]
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

en_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [32.335 µs 33.337 µs 34.624 µs]
                        thrpt:  [1.8645 Gelem/s 1.9365 Gelem/s 1.9965 Gelem/s]
                 change:
                        time:   [+1.4381% +4.8479% +9.3540%] (p = 0.02 < 0.05)
                        thrpt:  [-8.5538% -4.6238% -1.4177%]
                        Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) high mild
  14 (14.00%) high severe

fr_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [67.977 µs 68.122 µs 68.285 µs]
                        thrpt:  [2.0195 Gelem/s 2.0243 Gelem/s 2.0286 Gelem/s]
                 change:
                        time:   [-1.1403% -0.7725% -0.4256%] (p = 0.00 < 0.05)
                        thrpt:  [+0.4274% +0.7785% +1.1535%]
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe

fr_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [185.05 µs 186.57 µs 188.38 µs]
                        thrpt:  [732.04 Melem/s 739.13 Melem/s 745.22 Melem/s]
                 change:
                        time:   [+3.6879% +4.7132% +5.7066%] (p = 0.00 < 0.05)
                        thrpt:  [-5.3985% -4.5010% -3.5567%]
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

fr_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [181.21 µs 181.53 µs 181.85 µs]
                        thrpt:  [758.33 Melem/s 759.64 Melem/s 760.98 Melem/s]
                 change:
                        time:   [-0.4383% -0.0312% +0.3986%] (p = 0.89 > 0.05)
                        thrpt:  [-0.3971% +0.0312% +0.4402%]
                        No change in performance detected.
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) low mild
  12 (12.00%) high mild
  5 (5.00%) high severe

fr_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [330.62 µs 331.40 µs 332.19 µs]
                        thrpt:  [427.89 Melem/s 428.92 Melem/s 429.93 Melem/s]
                 change:
                        time:   [-2.4964% -1.0441% +0.2321%] (p = 0.15 > 0.05)
                        thrpt:  [-0.2316% +1.0552% +2.5603%]
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

ja_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [50.888 µs 51.089 µs 51.314 µs]
                        thrpt:  [227.95 Melem/s 228.96 Melem/s 229.86 Melem/s]
                 change:
                        time:   [+3.3507% +5.2586% +6.9075%] (p = 0.00 < 0.05)
                        thrpt:  [-6.4612% -4.9959% -3.2421%]
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

ja_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [62.166 µs 62.338 µs 62.507 µs]
                        thrpt:  [187.13 Melem/s 187.64 Melem/s 188.16 Melem/s]
                 change:
                        time:   [+3.9128% +4.2967% +4.6792%] (p = 0.00 < 0.05)
                        thrpt:  [-4.4700% -4.1197% -3.7654%]
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

ja_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [66.395 µs 67.344 µs 68.747 µs]
                        thrpt:  [170.15 Melem/s 173.69 Melem/s 176.17 Melem/s]
                 change:
                        time:   [+7.0942% +8.1448% +9.8184%] (p = 0.00 < 0.05)
                        thrpt:  [-8.9406% -7.5314% -6.6243%]
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

ja_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [88.329 µs 88.602 µs 88.895 µs]
                        thrpt:  [139.15 Melem/s 139.61 Melem/s 140.05 Melem/s]
                 change:
                        time:   [+2.8054% +3.3707% +3.8640%] (p = 0.00 < 0.05)
                        thrpt:  [-3.7203% -3.2608% -2.7289%]
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

kn_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [157.87 µs 158.32 µs 158.78 µs]
                        thrpt:  [138.62 Melem/s 139.03 Melem/s 139.43 Melem/s]
                 change:
                        time:   [+8.2759% +8.6679% +9.0735%] (p = 0.00 < 0.05)
                        thrpt:  [-8.3187% -7.9765% -7.6433%]
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

kn_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [131.57 µs 132.12 µs 132.75 µs]
                        thrpt:  [165.81 Melem/s 166.60 Melem/s 167.29 Melem/s]
                 change:
                        time:   [+1.7772% +2.2609% +2.7659%] (p = 0.00 < 0.05)
                        thrpt:  [-2.6915% -2.2109% -1.7461%]
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

kn_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [110.67 µs 111.20 µs 111.74 µs]
                        thrpt:  [196.99 Melem/s 197.95 Melem/s 198.89 Melem/s]
                 change:
                        time:   [+5.0535% +6.2351% +7.5893%] (p = 0.00 < 0.05)
                        thrpt:  [-7.0539% -5.8692% -4.8104%]
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  10 (10.00%) high mild

kn_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [195.72 µs 199.69 µs 204.71 µs]
                        thrpt:  [111.21 Melem/s 114.00 Melem/s 116.31 Melem/s]
                 change:
                        time:   [+5.4278% +6.3716% +7.5833%] (p = 0.00 < 0.05)
                        thrpt:  [-7.0488% -5.9899% -5.1483%]
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

ko_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [33.604 µs 33.734 µs 33.900 µs]
                        thrpt:  [255.70 Melem/s 256.95 Melem/s 257.94 Melem/s]
                 change:
                        time:   [-5.7636% -5.4412% -5.0604%] (p = 0.00 < 0.05)
                        thrpt:  [+5.3302% +5.7543% +6.1161%]
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

ko_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [134.79 µs 135.15 µs 135.56 µs]
                        thrpt:  [63.944 Melem/s 64.134 Melem/s 64.306 Melem/s]
                 change:
                        time:   [+0.5459% +0.8830% +1.2344%] (p = 0.00 < 0.05)
                        thrpt:  [-1.2193% -0.8752% -0.5429%]
                        Change within noise threshold.

ko_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [74.864 µs 75.254 µs 75.660 µs]
                        thrpt:  [114.57 Melem/s 115.18 Melem/s 115.78 Melem/s]
                 change:
                        time:   [+5.8810% +6.4632% +7.1555%] (p = 0.00 < 0.05)
                        thrpt:  [-6.6777% -6.0708% -5.5544%]
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

ko_nfd_to_nfc_utf16/icu4x                                                                            
                        time:   [241.31 µs 247.44 µs 255.36 µs]
                        thrpt:  [65.730 Melem/s 67.835 Melem/s 69.557 Melem/s]
                 change:
                        time:   [+0.6417% +1.6570% +2.9418%] (p = 0.00 < 0.05)
                        thrpt:  [-2.8578% -1.6300% -0.6376%]
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

vi_nfc_to_nfc_utf16/icu4x                                                                            
                        time:   [124.69 µs 125.10 µs 125.53 µs]
                        thrpt:  [531.19 Melem/s 533.02 Melem/s 534.78 Melem/s]
                 change:
                        time:   [+3.2157% +3.6979% +4.1796%] (p = 0.00 < 0.05)
                        thrpt:  [-4.0119% -3.5661% -3.1155%]
                        Performance has regressed.

vi_nfc_to_nfd_utf16/icu4x                                                                            
                        time:   [465.79 µs 468.38 µs 472.41 µs]
                        thrpt:  [141.15 Melem/s 142.36 Melem/s 143.16 Melem/s]
                 change:
                        time:   [-1.7718% -1.3023% -0.7777%] (p = 0.00 < 0.05)
                        thrpt:  [+0.7838% +1.3195% +1.8037%]
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

vi_nfd_to_nfd_utf16/icu4x                                                                            
                        time:   [445.26 µs 449.68 µs 455.14 µs]
                        thrpt:  [146.51 Melem/s 148.29 Melem/s 149.76 Melem/s]
                 change:
                        time:   [-2.1258% -1.0673% +0.0487%] (p = 0.05 > 0.05)
                        thrpt:  [-0.0487% +1.0788% +2.1720%]
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Benchmarking vi_nfd_to_nfc_utf16/icu4x: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.3s, enable flat sampling, or reduce sample count to 60.
vi_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [1.0095 ms 1.0122 ms 1.0149 ms]
                        thrpt:  [82.168 Melem/s 82.388 Melem/s 82.610 Melem/s]
                 change:
                        time:   [-1.9598% -1.6293% -1.2849%] (p = 0.00 < 0.05)
                        thrpt:  [+1.3017% +1.6563% +1.9990%]
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

vi_orthographic_to_nfc_utf16/icu4x                                                                            
                        time:   [907.60 µs 922.09 µs 939.94 µs]
                        thrpt:  [81.631 Melem/s 83.211 Melem/s 84.540 Melem/s]
                 change:
                        time:   [-4.2790% -2.3358% -0.7037%] (p = 0.01 < 0.05)
                        thrpt:  [+0.7087% +2.3916% +4.4703%]
                        Change within noise threshold.
Found 19 outliers among 100 measurements (19.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  12 (12.00%) high severe

zh_nfc_to_nfc_utf16/icu4x                                                                             
                        time:   [42.788 µs 42.977 µs 43.186 µs]
                        thrpt:  [264.97 Melem/s 266.26 Melem/s 267.43 Melem/s]
                 change:
                        time:   [+2.2566% +2.9416% +3.6184%] (p = 0.00 < 0.05)
                        thrpt:  [-3.4920% -2.8576% -2.2068%]
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

zh_nfc_to_nfd_utf16/icu4x                                                                             
                        time:   [44.104 µs 44.306 µs 44.522 µs]
                        thrpt:  [257.02 Melem/s 258.27 Melem/s 259.45 Melem/s]
                 change:
                        time:   [+4.7983% +5.4028% +5.9657%] (p = 0.00 < 0.05)
                        thrpt:  [-5.6299% -5.1258% -4.5786%]
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild

zh_nfd_to_nfd_utf16/icu4x                                                                             
                        time:   [44.380 µs 44.721 µs 45.072 µs]
                        thrpt:  [253.88 Melem/s 255.88 Melem/s 257.84 Melem/s]
                 change:
                        time:   [+4.8371% +5.4482% +6.0798%] (p = 0.00 < 0.05)
                        thrpt:  [-5.7313% -5.1667% -4.6139%]
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild

zh_nfd_to_nfc_utf16/icu4x                                                                             
                        time:   [42.865 µs 43.016 µs 43.164 µs]
                        thrpt:  [265.13 Melem/s 266.04 Melem/s 266.98 Melem/s]
                 change:
                        time:   [+2.1959% +2.7174% +3.2212%] (p = 0.00 < 0.05)
                        thrpt:  [-3.1207% -2.6455% -2.1487%]
                        Performance has regressed.

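I ran the post-PR benches first, so the change figures in the second run measure the pre-PR code against the post-PR baseline. A reported regression there (positive time delta, negative throughput delta) means this PR is faster.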
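
To make the sign convention concrete, here is a small sketch that recomputes the relative change from two median times quoted in the logs (`el_nfc_to_nfc_utf16`: 62.066 µs after the PR, 67.761 µs before). The function and constant names are hypothetical, and the result only approximates Criterion's own estimate, which is derived from the full samples rather than from the two medians.

```rust
/// Median times for el_nfc_to_nfc_utf16, in microseconds, from the logs.
pub const AFTER_PR_US: f64 = 62.066;
pub const BEFORE_PR_US: f64 = 67.761;

/// Relative change of `new` with respect to `baseline`, as a fraction.
pub fn relative_change(baseline: f64, new: f64) -> f64 {
    (new - baseline) / baseline
}

/// What the second run reports: pre-PR code measured against the
/// post-PR baseline, because the post-PR run was recorded first.
pub fn change_as_logged() -> f64 {
    relative_change(AFTER_PR_US, BEFORE_PR_US)
}

/// The same pair seen from the PR's point of view: post-PR against pre-PR.
pub fn change_from_pr() -> f64 {
    relative_change(BEFORE_PR_US, AFTER_PR_US)
}
```

With these two medians the logged change comes out at roughly +9% in time, which corresponds to the PR cutting the time by roughly 8%.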

/// zerovec.with_mut(|v| v.push(12_u16.to_unaligned()));
/// assert!(zerovec.is_owned());
/// ```
pub fn with_mut<R>(&mut self, f: impl FnOnce(&mut Vec<T::ULE>) -> R) -> R {
Member

I'm not a fan of the continuation-passing style. Many uses of this actually only need a mutable slice, which is something we can return. Other places, like datagen, push elements by calling with_mut repeatedly, which I don't find very readable; they should assemble a Vec and then convert it into a ZeroVec at the end.

Member Author

@Manishearth Manishearth Sep 21, 2022

Good point, added. I didn't move people off of with_mut; that might be a change we can make later. Ultimately I think such an API is still valuable to have.
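
The two styles discussed in this thread can be contrasted with a minimal sketch. `CowVec` is a hypothetical stand-in built on `std::borrow::Cow`, not the real `ZeroVec`. Its `with_mut` mirrors the signature shown in the diff, and `to_mut_slice` is an assumed name for the slice-returning accessor.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a borrowed-or-owned vector of `u16`.
pub struct CowVec<'a>(Cow<'a, [u16]>);

impl<'a> CowVec<'a> {
    pub fn from_borrowed(slice: &'a [u16]) -> Self {
        CowVec(Cow::Borrowed(slice))
    }

    pub fn from_vec(vec: Vec<u16>) -> Self {
        CowVec(Cow::Owned(vec))
    }

    /// Continuation-passing style: the closure receives the owned Vec.
    pub fn with_mut<R>(&mut self, f: impl FnOnce(&mut Vec<u16>) -> R) -> R {
        f(self.0.to_mut())
    }

    /// Returns a mutable slice directly; enough when the length is unchanged.
    pub fn to_mut_slice(&mut self) -> &mut [u16] {
        self.0.to_mut().as_mut_slice()
    }

    pub fn as_slice(&self) -> &[u16] {
        &self.0
    }
}

/// The style the reviewer finds hard to read: one `with_mut` call per push.
pub fn build_by_repeated_with_mut(items: &[u16]) -> CowVec<'static> {
    let mut out = CowVec::from_vec(Vec::new());
    for &item in items {
        out.with_mut(|v| v.push(item));
    }
    out
}

/// The suggested style: assemble a plain Vec, convert once at the end.
pub fn build_then_convert(items: &[u16]) -> CowVec<'static> {
    let vec: Vec<u16> = items.iter().copied().collect();
    CowVec::from_vec(vec)
}
```

Both builders produce the same contents. The closure form remains useful when a caller needs to grow or shrink an existing value in place, which is the case the author wants to keep supporting.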

robertbastian
robertbastian previously approved these changes Sep 21, 2022
utils/zerovec/src/zerovec/mod.rs (outdated, resolved)
robertbastian
robertbastian previously approved these changes Sep 21, 2022
@Manishearth Manishearth merged commit 0e395d9 into unicode-org:main Sep 21, 2022
@Manishearth
Member Author

wasm build failed due to flakiness

@Manishearth Manishearth deleted the zerovec-structify branch September 21, 2022 19:07
Member

@sffc sffc left a comment

Post-submit review

 where
-    T: AsULE + ?Sized,
+    T: AsULE,
Member

Question: Why did you drop the ?Sized ?

Member Author

@Manishearth Manishearth Sep 23, 2022

It's completely useless: this is for Copy stack types, and there's no way to have a ?Sized type in ULE and have it work. The ULE methods won't be callable for such types.

Most of the ZeroVec methods also can't have ?Sized bounds.

Member

But this is the AsULE. I can't think of a specific use case, but it doesn't seem that T: Sized needs to be a requirement

Member Author

AsULE returns owned values, so to do anything useful it has to be Sized.

In general the idea is to be as generic as possible in trait bounds where more specific bounds are unnecessary, but I don't think that advice applies to opt-out bounds like ?Sized. "Be as generic as possible" is shorthand for "don't use unnecessary bounds".
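A small sketch may make the Sized argument concrete. The trait `AsUleLike` below is a hypothetical, simplified stand-in for a trait shaped like AsULE; it assumes a `Copy` supertrait and by-value conversion methods, matching the "Copy stack types" description above. It is not the real trait definition.

```rust
/// Hypothetical stand-in for a trait shaped like `AsULE`.
/// `Copy` requires `Clone`, and `Clone` requires `Sized`, so every
/// implementor is already `Sized`.
trait AsUleLike: Copy {
    type Ule: Copy;
    /// Takes `self` by value, which only works for sized types.
    fn to_unaligned(self) -> Self::Ule;
    /// Returns `Self` by value, which also needs a sized type.
    fn from_unaligned(unaligned: Self::Ule) -> Self;
}

impl AsUleLike for u32 {
    type Ule = [u8; 4];
    fn to_unaligned(self) -> [u8; 4] {
        self.to_le_bytes()
    }
    fn from_unaligned(unaligned: [u8; 4]) -> Self {
        u32::from_le_bytes(unaligned)
    }
}

/// Relaxing this bound with `?Sized` would not admit any additional
/// types, because no unsized type can implement `AsUleLike`.
fn round_trip<T: AsUleLike>(value: T) -> T {
    T::from_unaligned(value.to_unaligned())
}

fn main() {
    assert_eq!(round_trip(0xDEAD_BEEFu32), 0xDEAD_BEEF);
    assert_eq!(0x0102_0304u32.to_unaligned(), [4, 3, 2, 1]);
}
```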

-    ZeroVec::Owned(vec) => &**vec,
-    ZeroVec::Borrowed(slice) => *slice,
-};
+let slice = unsafe { &*self.buf };
Member

Nit: Safety comment. It's not super clear what is being cast from and to.
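As a rough illustration of the kind of safety comment being asked for, here is a sketch on a simplified model type. `MiniVec`, its fields, and its methods are hypothetical and fixed to `u16`; only the borrowed case is modeled, and the "capacity of zero means borrowed" convention follows the PR description rather than the actual source.

```rust
use std::marker::PhantomData;

/// Hypothetical model of a struct-based vector, not the real ZeroVec.
/// A `capacity` of zero marks the buffer as borrowed.
struct MiniVec<'a> {
    buf: *mut [u16],
    capacity: usize,
    marker: PhantomData<&'a [u16]>,
}

impl<'a> MiniVec<'a> {
    fn new_borrowed(slice: &'a [u16]) -> Self {
        MiniVec {
            // The pointer is stored as `*mut` but is never written through
            // while the buffer is borrowed.
            buf: slice as *const [u16] as *mut [u16],
            capacity: 0,
            marker: PhantomData,
        }
    }

    fn is_owned(&self) -> bool {
        self.capacity != 0
    }

    fn as_slice(&self) -> &[u16] {
        // SAFETY: `buf` is a `*mut [u16]` that was created from a valid
        // `&'a [u16]`, so it is non-null, aligned, and covers initialized
        // elements that stay alive for `'a`. `self` cannot outlive `'a`,
        // and the result is a shared reference tied to `&self`, so no
        // mutable alias can exist while it is in use.
        unsafe { &*self.buf }
    }
}

fn main() {
    let data = [10u16, 20, 30];
    let v = MiniVec::new_borrowed(&data);
    assert!(!v.is_owned());
    assert_eq!(v.as_slice(), &data[..]);
}
```

The comment names both the source type of the pointer and the reference type produced, which is the part the review found unclear.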

Comment on lines +100 to +101
unsafe impl<'a, T: AsULE> Send for ZeroVec<'a, T> where T::ULE: Send + Sync {}
unsafe impl<'a, T: AsULE> Sync for ZeroVec<'a, T> where T::ULE: Sync {}
Member

Nit: Safety docs. Why does T::ULE: Sync imply ZeroVec<'a, T>: Sync?
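One possible shape for those safety docs, shown on a hypothetical model type rather than the real one: `RawSlice` below only models the borrowed case, and the reasoning in the comments is an assumption about why the bounds were chosen (a value that acts as either an owned `Vec<T>` or a `&[T]`).

```rust
use std::marker::PhantomData;
use std::thread;

/// Hypothetical, simplified model: a raw slice pointer tied to `'a`.
/// Raw pointers are neither `Send` nor `Sync`, so both impls are manual.
struct RawSlice<'a, T> {
    buf: *const [T],
    marker: PhantomData<&'a [T]>,
}

impl<'a, T> RawSlice<'a, T> {
    fn new_borrowed(slice: &'a [T]) -> Self {
        RawSlice { buf: slice as *const [T], marker: PhantomData }
    }

    fn as_slice(&self) -> &[T] {
        // SAFETY: `buf` came from a `&'a [T]` and `self` cannot outlive `'a`.
        unsafe { &*self.buf }
    }
}

// SAFETY: the value acts like either a `Vec<T>` (Send when `T: Send`) or a
// `&'a [T]` (Send when `T: Sync`), so requiring both bounds covers both cases.
unsafe impl<'a, T: Send + Sync> Send for RawSlice<'a, T> {}

// SAFETY: a `&RawSlice<T>` only ever exposes `&[T]`, which is safe to share
// between threads exactly when `T: Sync`.
unsafe impl<'a, T: Sync> Sync for RawSlice<'a, T> {}

/// Shares the value with a scoped worker thread; this compiles only
/// because of the `Sync` impl above.
fn sum_in_thread(data: &RawSlice<'_, u32>) -> u32 {
    thread::scope(|s| {
        s.spawn(|| data.as_slice().iter().sum::<u32>())
            .join()
            .expect("worker thread panicked")
    })
}

fn main() {
    let nums = [1u32, 2, 3];
    let v = RawSlice::new_borrowed(&nums);
    assert_eq!(sum_in_thread(&v), 6);
}
```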

///
/// # Example
///
/// ```rust,ignore
Member

Nit: Don't ignore this example. Make it so that it runs and passes.

Member Author

Huh, I wonder why this was marked ignore initially.

/// ```
Borrowed(&'a [T::ULE]),
/// Pointer to data
buf: *mut [T::ULE],
Member

@sffc sffc Sep 23, 2022

Issue: Don't you need a manual Drop impl to clean this up? (I think it can just call self.into_cow())

Member Author

oh shoot, yeah
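To show why the manual Drop matters once the buffer is a raw pointer, here is a sketch on a hypothetical model type. `MiniVec` and its methods are illustrative, fixed to `u16`, and not the code that landed. Unlike the suggestion above, this `drop` rebuilds the `Vec` directly instead of calling `into_cow`, since `drop` only receives `&mut self`.

```rust
use std::borrow::Cow;
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical model: `capacity == 0` means borrowed, otherwise the
/// buffer is a heap allocation that originally belonged to a `Vec<u16>`.
struct MiniVec<'a> {
    buf: *mut [u16],
    capacity: usize,
    marker: PhantomData<&'a [u16]>,
}

impl<'a> MiniVec<'a> {
    fn new_borrowed(slice: &'a [u16]) -> Self {
        MiniVec { buf: slice as *const [u16] as *mut [u16], capacity: 0, marker: PhantomData }
    }

    fn new_owned(vec: Vec<u16>) -> Self {
        // Take over the allocation without running the Vec's destructor.
        let mut vec = ManuallyDrop::new(vec);
        let (ptr, len, capacity) = (vec.as_mut_ptr(), vec.len(), vec.capacity());
        MiniVec { buf: ptr::slice_from_raw_parts_mut(ptr, len), capacity, marker: PhantomData }
    }

    fn into_cow(self) -> Cow<'a, [u16]> {
        // Suppress our own Drop so the buffer is not freed twice.
        let this = ManuallyDrop::new(self);
        // SAFETY: `buf` always points to valid, initialized elements.
        let len = unsafe { &*this.buf }.len();
        if this.capacity != 0 {
            // SAFETY: pointer, length, and capacity all came from one
            // `Vec<u16>` in `new_owned` and have not been changed since.
            Cow::Owned(unsafe { Vec::from_raw_parts(this.buf as *mut u16, len, this.capacity) })
        } else {
            // SAFETY: the borrowed buffer came from a `&'a [u16]`.
            Cow::Borrowed(unsafe { &*this.buf })
        }
    }
}

impl Drop for MiniVec<'_> {
    fn drop(&mut self) {
        if self.capacity != 0 {
            // SAFETY: same invariants as the owned branch of `into_cow`.
            // Without this impl the allocation would leak.
            let len = unsafe { &*self.buf }.len();
            drop(unsafe { Vec::from_raw_parts(self.buf as *mut u16, len, self.capacity) });
        }
    }
}

fn main() {
    let owned = MiniVec::new_owned(vec![1, 2, 3]);
    assert_eq!(owned.into_cow().into_owned(), vec![1u16, 2, 3]);
    {
        let _dropped = MiniVec::new_owned(vec![4, 5]); // freed by Drop here
    }
    let data = [9u16];
    assert!(matches!(MiniVec::new_borrowed(&data).into_cow(), Cow::Borrowed(_)));
}
```

In this model the vec is taken apart in one place (`new_owned`) and rebuilt from the same three values in `into_cow` and `drop`.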

@@ -433,10 +454,28 @@ where
let ptr = ptr as *mut P::ULE;
Vec::from_raw_parts(ptr, len, cap)
Member

Suggestion: There are now multiple places where we convert back and forth between a vec and its raw parts (here, into_cow, and new_owned). Consider making them share code or at least use the same style with similar safety comments.

Member Author

They already share code as much as possible; we deconstruct the vec in one place (new_owned) and construct it once (into_cow). There's also try_into_converted but that's about constructing a vec of a different type
