Optimize mem utils functions #5658

Merged: 27 commits, Aug 30, 2022

@solotzg solotzg commented Aug 19, 2022

What problem does this PR solve?

Issue Number: ref #5294

What is changed and how it works?

  • add functions avx2_strstr, avx2_mem_equal, avx2_memchr
    • avx2_strstr behaves the same as std::string_view::find
    • avx2_mem_equal behaves the same as bcmp or std::memcmp(p1, p2, n) == 0
    • avx2_memchr behaves the same as std::memchr
  • optimize string equality comparison. The original implementation,
    return mem_utils::memoryEqual(lhs.data, rhs.data, lhs.size);
    is much slower than std::memcmp when the string size is not very large.
    • according to the test results, the avx512 instructions may only start to give better results once str-size exceeds 1000000
  • optimize expression like
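
The equivalences listed for the three new functions can be written down as plain standard-library reference helpers, which may be useful as a test oracle for a SIMD version. This is only a sketch: the `ref_*` names and their signatures are hypothetical and are not taken from the PR.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical reference helpers (names are NOT from the PR). They only
// restate the standard-library behavior the avx2_* functions are said to match.

// Same result as bcmp(p1, p2, n) == 0 or std::memcmp(p1, p2, n) == 0.
inline bool ref_mem_equal(const char * p1, const char * p2, std::size_t n)
{
    return std::memcmp(p1, p2, n) == 0;
}

// Same result as std::string_view::find: index of the first match, or npos.
inline std::size_t ref_strstr(std::string_view src, std::string_view needle)
{
    return src.find(needle);
}

// Same result as std::memchr: pointer to the first occurrence of c, or nullptr.
inline const char * ref_memchr(const char * p, char c, std::size_t n)
{
    return static_cast<const char *>(std::memchr(p, static_cast<unsigned char>(c), n));
}
```

A unit test could run the optimized function and the matching `ref_*` helper on the same inputs and require identical results.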

Benchmark

ENV

  • tpch-100
  • 1 tiflash
  • CPU limited to 200%
  • x86-64/amd64
  • original commit: 8404e65

BENCH_EQ_COLLATOR(UTF8MB4_BIN);
BENCH_EQ_COLLATOR(UTF8MB4_GENERAL_CI);
BENCH_EQ_COLLATOR(UTF8MB4_UNICODE_CI);
BENCH_EQ_COLLATOR(UTF8_BIN);
BENCH_EQ_COLLATOR(UTF8_GENERAL_CI);
BENCH_EQ_COLLATOR(UTF8_UNICODE_CI);
BENCH_EQ_COLLATOR(ASCII_BIN);
BENCH_EQ_COLLATOR(BINARY);
BENCH_EQ_COLLATOR(LATIN1_BIN);
BENCH_LIKE_COLLATOR(UTF8MB4_BIN);
BENCH_LIKE_COLLATOR(UTF8MB4_GENERAL_CI);
BENCH_LIKE_COLLATOR(UTF8MB4_UNICODE_CI);
BENCH_LIKE_COLLATOR(UTF8_BIN);
BENCH_LIKE_COLLATOR(UTF8_GENERAL_CI);
BENCH_LIKE_COLLATOR(UTF8_UNICODE_CI);
BENCH_LIKE_COLLATOR(ASCII_BIN);
BENCH_LIKE_COLLATOR(BINARY);
BENCH_LIKE_COLLATOR(LATIN1_BIN);

Time (ns). Improvement = (Original) / (Optimized) - 1.0

Benchmark                           Original     Optimized    Improvement
-------------------------------------------------------------------------
CollationEqBench/UTF8MB4_BIN        12428711       6228798         99.54%
CollationEqBench/UTF8_BIN           12956705       6141843        110.96%
CollationEqBench/ASCII_BIN          12625723       6229335        102.68%
CollationEqBench/BINARY             11870078       5837615        103.34%
CollationEqBench/LATIN1_BIN         13768201       6732640        104.50%
CollationLikeBench/UTF8MB4_BIN      37940667      20185747         87.96%
CollationLikeBench/UTF8_BIN         37803575      19914106         89.83%
CollationLikeBench/ASCII_BIN        36860160      17999743        104.78%
CollationLikeBench/BINARY           37449881      17599053        112.79%
CollationLikeBench/LATIN1_BIN       37503432      17675036        112.18%
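
As a worked example of the Improvement column, the formula from the table header can be written as a one-line helper. The function name `improvement` is hypothetical and is used only for illustration.

```cpp
// Improvement as defined in the table header: (Original) / (Optimized) - 1.0.
// Hypothetical helper name; inputs are times in the same unit (here ns).
inline double improvement(double original_ns, double optimized_ns)
{
    return original_ns / optimized_ns - 1.0;
}
```

For the CollationEqBench/UTF8MB4_BIN row, `improvement(12428711, 6228798)` is about 0.9954, i.e. 99.54%. An improvement close to 100% under this definition means the optimized version takes roughly half the time.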

  • test bcmp, mem_utils::memoryEqual (uses avx512), and avx2_mem_equal
  • test std::string_view::find and avx2_strstr
  • MemUtilsEqual_xxx means the tested str-size is xxx
  • MemUtilsStrStr_xxx_yyy means the tested src-str-size is xxx and needle-str-size is yyy

Time (ns). Improvement columns: (STL) / (Optimized) - 1.0 and (Original) / (Optimized) - 1.0

Benchmark                      STL    Original-avx512    Optimized-avx2    vs. STL    vs. Original
--------------------------------------------------------------------------------------------------
check mem eq: MemUtilsEqual_${str-size}
MemUtilsEqual_13              4.46               7.22              4.15      7.47%          73.98%
MemUtilsEqual_65              4.88               8.69              4.31     13.23%         101.62%
MemUtilsEqual_100              9.9               13.3              5.65     75.22%         135.40%
MemUtilsEqual_10000            268                323               162     65.43%          99.38%
MemUtilsEqual_100000          3939               4353              3462     13.78%          25.74%
MemUtilsEqual_1000000        62265              53157             52600     18.37%           1.06%

str find: MemUtilsStrStr_${src-str-size}_${needle-str-size}
MemUtilsStrStr_1024_1        30882                  -             21275     45.16%               -
MemUtilsStrStr_1024_7        34927                  -             21279     64.14%               -
MemUtilsStrStr_1024_15       39364                  -             23161     69.96%               -
MemUtilsStrStr_1024_31       40628                  -             29435     38.03%               -
MemUtilsStrStr_1024_63       37381                  -             26141     43.00%               -
MemUtilsStrStr_80_1           6130                  -              3977     54.14%               -
MemUtilsStrStr_80_7          11720                  -              6278     86.68%               -
MemUtilsStrStr_80_15         11585                  -              5423    113.63%               -
MemUtilsStrStr_80_31         11467                  -              9530     20.33%               -

SQL

select count(1) from orders where o_comment like '%pending%deposits%';
Time (s). Improvement = AVG(Original) / AVG(Optimized) - 1.0

         Original    Optimized    Improvement
---------------------------------------------
            10.75         8.72
            10.92         8.87
            10.98         9.05
            10.7          8.64
            10.77         8.78
AVG        10.824        8.812         22.83%
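
To show why substring search speed matters for the query above, here is a minimal sketch of how a pattern such as '%pending%deposits%' can be evaluated with repeated substring searches. It assumes byte-wise (binary) comparison and literal segments with no '_' wildcard or escapes; `like_contains_in_order` is a hypothetical name, and how TiFlash actually evaluates LIKE is not shown here.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string_view>

// Hypothetical helper (NOT from the PR): evaluates a pattern of the form
// '%s1%s2%...%' whose segments are plain literals, under byte-wise comparison.
inline bool like_contains_in_order(std::string_view text,
                                   std::initializer_list<std::string_view> segments)
{
    std::size_t pos = 0;
    for (std::string_view seg : segments)
    {
        // Leftmost match at or after pos; std::string_view::find is the
        // behavior avx2_strstr is described as matching.
        const std::size_t hit = text.find(seg, pos);
        if (hit == std::string_view::npos)
            return false;
        pos = hit + seg.size(); // the next segment must start after this match
    }
    return true;
}
```

For example, `like_contains_in_order(comment, {"pending", "deposits"})` corresponds to `comment like '%pending%deposits%'` under the stated assumptions. Each row costs one or more substring searches, so a faster search routine affects the whole scan.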

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@solotzg solotzg added type/enhancement The issue or PR belongs to an enhancement. type/performance labels Aug 19, 2022

ti-chi-bot commented Aug 19, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • windtalker
  • zanmato1984

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 19, 2022
- add `avx2_strstr` to accelerate substr search
- add `avx2_mem_equal` to accelerate mem equal cmp
@pingcap pingcap deleted a comment from sre-bot Aug 25, 2022

solotzg commented Aug 29, 2022

/run-all-tests

@zanmato1984 zanmato1984 left a comment

LGTM

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 29, 2022
@pingcap pingcap deleted a comment from sre-bot Aug 29, 2022

solotzg commented Aug 29, 2022

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 29, 2022

solotzg commented Aug 29, 2022

/merge

@ti-chi-bot
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 8c85675

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 29, 2022
@pingcap pingcap deleted a comment from sre-bot Aug 29, 2022

solotzg commented Aug 30, 2022

/merge

@ti-chi-bot
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests


solotzg commented Aug 30, 2022

/unhold

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 30, 2022

sre-bot commented Aug 30, 2022

Coverage for changed files

Filename                                                        Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/Columns/ColumnString.cpp                                   172                81    52.91%          26                12    53.85%         402               204    49.25%         118                66    44.07%
dbms/src/Functions/CollationOperatorOptimized.h                     130                26    80.00%          11                 0   100.00%         201                 7    96.52%          78                 9    88.46%
dbms/src/Functions/CollationStringOptimized.cpp                       3                 0   100.00%           3                 0   100.00%           9                 0   100.00%           0                 0         -
dbms/src/Functions/CollationStringSearchOptimized.h                 141                 3    97.87%          20                 0   100.00%         302                 5    98.34%          84                 2    97.62%
dbms/src/Functions/FunctionsComparison.cpp                            8                 7    12.50%           8                 7    12.50%          42                29    30.95%           0                 0         -
dbms/src/Functions/FunctionsComparison.h                            604               304    49.67%          63                28    55.56%         949               505    46.79%         476               283    40.55%
dbms/src/Functions/FunctionsStringSearch.cpp                        645               344    46.67%          57                30    47.37%        1315               691    47.45%         410               223    45.61%
dbms/src/Functions/tests/gtest_strings_cmp.cpp                      112                14    87.50%           2                 0   100.00%          96                 0   100.00%          14                 8    42.86%
dbms/src/Storages/Transaction/CollatorUtils.h                        34                 4    88.24%          11                 1    90.91%          68                 8    88.24%          14                 1    92.86%
dbms/src/Storages/Transaction/tests/gtest_tidb_collator.cpp          23                 0   100.00%           6                 0   100.00%          91                 0   100.00%          14                 1    92.86%
libs/libcommon/include/common/StringRef.h                            49                12    75.51%          21                 5    76.19%          92                22    76.09%          26                12    53.85%
libs/libcommon/include/common/avx2_mem_utils.h                      215                21    90.23%          20                 3    85.00%         326                58    82.21%         118                 0   100.00%
libs/libcommon/include/common/avx2_strstr.h                         149                17    88.59%          14                 0   100.00%         229                31    86.46%          82                 3    96.34%
libs/libcommon/include/common/mem_utils.h                           127                34    73.23%           7                 0   100.00%         130                19    85.38%         116                37    68.10%
libs/libcommon/include/common/mem_utils_opt.h                         6                 1    83.33%           3                 1    66.67%          45                33    26.67%           2                 0   100.00%
libs/libcommon/src/avx2_mem_utils_impl.cpp                            5                 1    80.00%           5                 1    80.00%          15                 3    80.00%           0                 0         -
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                              2423               869    64.14%         277                88    68.23%        4312              1615    62.55%        1552               645    58.44%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18479      8325             54.95%    213806  85973        59.79%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot merged commit a8c8cb1 into pingcap:master Aug 30, 2022
@solotzg solotzg deleted the optimize-mem-utils branch August 30, 2022 02:03

solotzg commented Aug 30, 2022

/run-sanitizer-test asan

@pingcap pingcap deleted a comment from sre-bot Aug 30, 2022
@solotzg solotzg mentioned this pull request Aug 31, 2022
12 tasks
solotzg added a commit to solotzg/tiflash that referenced this pull request Sep 19, 2022