benchmark -C codegen-units=N #39

Closed
llogiq opened this issue Feb 20, 2018 · 2 comments

Comments

@llogiq
Owner

llogiq commented Feb 20, 2018

Apparently, the default changed to codegen-units=16. However, when benchmarking I don't see a clear winner between N=1 and N=16: there is a small win on byte counting and a small loss on num_chars, but in both cases it's a wash. Perhaps someone else can benchmark on a beefier machine?

@Deewiant

I don't know what's "beefier" but I gave it an attempt with my i7-5930K @ 4.3 GHz. This is what cargo benchcmp says, excluding cases where the diff was either within 2 ns/iter or below 1.00%:

 name                               units1 ns/iter  units16 ns/iter  diff ns/iter  diff   speedup
 bench_count_00250_naive            69              74                          5   7.25%   x 0.93
 bench_count_01700_naive            463             458                        -5  -1.08%   x 1.01
 bench_count_02500_32               323             318                        -5  -1.55%   x 1.02
 bench_count_06000_32               764             748                       -16  -2.09%   x 1.02
 bench_count_09000_hyper            402             398                        -4  -1.00%   x 1.01
 bench_count_12000_32               1,521           1,490                     -31  -2.04%   x 1.02
 bench_count_12000_naive            3,231           3,191                     -40  -1.24%   x 1.01
 bench_count_17000_hyper            751             776                        25   3.33%   x 0.97
 bench_count_21000_hyper            919             964                        45   4.90%   x 0.95
 bench_count_25000_hyper            1,100           1,129                      29   2.64%   x 0.97
 bench_count_25000_naive            6,647           6,728                      81   1.22%   x 0.99
 bench_count_30000_hyper            1,305           1,369                      64   4.90%   x 0.95
 bench_num_chars_00900_naive        357             352                        -5  -1.40%   x 1.01
 bench_num_chars_01200_hyper        54              58                          4   7.41%   x 0.93
 bench_num_chars_01400_naive        543             549                         6   1.10%   x 0.99
 bench_num_chars_01700_naive        665             656                        -9  -1.35%   x 1.01
 bench_num_chars_02100_naive        819             808                       -11  -1.34%   x 1.01
 bench_num_chars_03000_naive        1,166           1,151                     -15  -1.29%   x 1.01
 bench_num_chars_07000_naive        2,707           2,673                     -34  -1.26%   x 1.01
 bench_num_chars_12000_naive        4,632           4,574                     -58  -1.25%   x 1.01
 bench_num_chars_17000_hyper        699             680                       -19  -2.72%   x 1.03
 bench_num_chars_21000_hyper        863             844                       -19  -2.20%   x 1.02
 bench_num_chars_21000_naive        7,993           8,098                     105   1.31%   x 0.99
 bench_num_chars_25000_hyper        1,024           1,001                     -23  -2.25%   x 1.02
 bench_num_chars_25000_naive        9,518           9,636                     118   1.24%   x 0.99
 bench_num_chars_30000_hyper        1,225           1,199                     -26  -2.12%   x 1.02
 bench_num_chars_30000_naive        11,418          11,584                    166   1.45%   x 0.99
 bench_num_chars_big_1000000_hyper  42,797          43,347                    550   1.29%   x 0.99

In short, it looks like a wash to me too. I also ran the comparison with the AVX options:

 name                               avx-units1 ns/iter  avx-units16 ns/iter  diff ns/iter   diff   speedup
 bench_count_00030_32               15                  12                             -3  -20.00%   x 1.25
 bench_count_00400_hyper            13                  16                              3   23.08%   x 0.81
 bench_count_01200_hyper            19                  22                              3   15.79%   x 0.86
 bench_count_01400_hyper            23                  26                              3   13.04%   x 0.88
 bench_count_01700_hyper            20                  24                              4   20.00%   x 0.83
 bench_count_02100_hyper            27                  30                              3   11.11%   x 0.90
 bench_count_02500_hyper            27                  30                              3   11.11%   x 0.90
 bench_count_02500_naive            395                 390                            -5   -1.27%   x 1.01
 bench_count_04000_hyper            35                  39                              4   11.43%   x 0.90
 bench_count_04000_naive            628                 620                            -8   -1.27%   x 1.01
 bench_count_05000_hyper            46                  50                              4    8.70%   x 0.92
 bench_count_05000_naive            787                 776                           -11   -1.40%   x 1.01
 bench_count_06000_32               477                 471                            -6   -1.26%   x 1.01
 bench_count_06000_hyper            54                  58                              4    7.41%   x 0.93
 bench_count_06000_naive            939                 928                           -11   -1.17%   x 1.01
 bench_count_08000_32               628                 621                            -7   -1.11%   x 1.01
 bench_count_08000_hyper            64                  69                              5    7.81%   x 0.93
 bench_count_08000_naive            1,253               1,238                         -15   -1.20%   x 1.01
 bench_count_09000_naive            1,410               1,393                         -17   -1.21%   x 1.01
 bench_count_10000_32               791                 780                           -11   -1.39%   x 1.01
 bench_count_10000_naive            1,563               1,545                         -18   -1.15%   x 1.01
 bench_count_12000_32               940                 929                           -11   -1.17%   x 1.01
 bench_count_12000_naive            1,876               1,854                         -22   -1.17%   x 1.01
 bench_count_14000_32               1,101               1,088                         -13   -1.18%   x 1.01
 bench_count_17000_naive            2,658               2,629                         -29   -1.09%   x 1.01
 bench_count_21000_32               1,645               1,626                         -19   -1.16%   x 1.01
 bench_count_21000_naive            3,282               3,244                         -38   -1.16%   x 1.01
 bench_count_25000_32               1,956               1,933                         -23   -1.18%   x 1.01
 bench_count_25000_naive            3,905               3,864                         -41   -1.05%   x 1.01
 bench_count_30000_32               2,348               2,324                         -24   -1.02%   x 1.01
 bench_count_big_0100000_hyper      1,095               1,071                         -24   -2.19%   x 1.02
 bench_count_big_1000000_hyper      25,438              24,621                       -817   -3.21%   x 1.03
 bench_num_chars_01200_naive        233                 230                            -3   -1.29%   x 1.01
 bench_num_chars_05000_naive        971                 960                           -11   -1.13%   x 1.01
 bench_num_chars_06000_naive        1,161               1,147                         -14   -1.21%   x 1.01
 bench_num_chars_10000_naive        1,933               1,910                         -23   -1.19%   x 1.01
 bench_num_chars_14000_naive        2,705               2,674                         -31   -1.15%   x 1.01
 bench_num_chars_17000_naive        3,287               3,253                         -34   -1.03%   x 1.01
 bench_num_chars_21000_hyper        238                 233                            -5   -2.10%   x 1.02
 bench_num_chars_25000_hyper        281                 277                            -4   -1.42%   x 1.01
 bench_num_chars_big_1000000_hyper  26,756              25,764                       -992   -3.71%   x 1.04

The differences are bigger but it still feels like noise to me.
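
For readers who want to check the columns by hand, here is a minimal sketch of how the diff, percentage, and speedup figures relate, and of the filter described above (drop rows within 2 ns/iter or below 1.00%). The `Comparison` type and its method names are made up for illustration and are not part of cargo benchcmp. Comparing against the percentage as displayed (rounded to two decimals) is an assumption about the boundary handling. The third sample row is hypothetical; the other two come from the first table.

```rust
/// One row of a benchmark comparison, timings in ns/iter.
/// (Illustrative type, not part of cargo benchcmp.)
struct Comparison {
    name: &'static str,
    old_ns: u64,
    new_ns: u64,
}

impl Comparison {
    /// Absolute change in ns/iter; positive means the new run is slower.
    fn diff_ns(&self) -> i64 {
        self.new_ns as i64 - self.old_ns as i64
    }

    /// Relative change as a percentage of the old timing.
    fn diff_pct(&self) -> f64 {
        self.diff_ns() as f64 / self.old_ns as f64 * 100.0
    }

    /// Speedup factor old / new; values below 1.0 are slowdowns.
    fn speedup(&self) -> f64 {
        self.old_ns as f64 / self.new_ns as f64
    }

    /// Keep a row only if it moved by more than 2 ns/iter and by at
    /// least 1.00%. Assumption: the percentage is compared after
    /// rounding to two decimals, as it is displayed in the tables.
    fn is_notable(&self) -> bool {
        let shown_pct = (self.diff_pct().abs() * 100.0).round() / 100.0;
        self.diff_ns().abs() > 2 && shown_pct >= 1.0
    }
}

fn main() {
    let rows = [
        // Two rows taken from the first table above.
        Comparison { name: "bench_count_00250_naive", old_ns: 69, new_ns: 74 },
        Comparison { name: "bench_count_12000_32", old_ns: 1521, new_ns: 1490 },
        // Hypothetical row that the filter would drop (1 ns/iter change).
        Comparison { name: "hypothetical_small_change", old_ns: 100, new_ns: 101 },
    ];

    for row in rows.iter().filter(|r| r.is_notable()) {
        println!(
            "{:<28} {:>6} {:>6} {:>5} {:>7.2}% x {:.2}",
            row.name,
            row.old_ns,
            row.new_ns,
            row.diff_ns(),
            row.diff_pct(),
            row.speedup()
        );
    }
}
```

With this definition a positive diff and a speedup below 1.00 both mean that the units16 build was slower for that benchmark.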

@llogiq
Owner Author

llogiq commented Feb 27, 2018

Thank you! That confirms my suspicion: either there's just not enough code to make a difference, or ThinLTO removes what difference is left.

@llogiq llogiq closed this as completed Feb 27, 2018