Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize block sort #5908

Merged
merged 10 commits into from
Sep 23, 2022
Merged

Conversation

solotzg
Copy link
Contributor

@solotzg solotzg commented Sep 16, 2022

What problem does this PR solve?

Issue Number: ref #5294

What is changed and how it works?

  • optimize when number of sort key is 2 and type of sort key is one of
    • UInt64
    • Int64
    • StringBinPadding
    • StringBinNoPadding
    • StringWithCollatorGeneric
  • we can optimize the case that number of sort key is 3 later if necessary (there will be a lot of template instantiations)

Benchmark

ENV

  • tpch-100
  • 1 tiflash
  • limit cpu up to 2000%
  • original commit: 6d0cbc8

SQL

EXPLAIN analyze SELECT ROW_NUMBER() OVER w1 FROM PART window w1 AS (PARTITION BY P_MFGR order by P_SIZE);
  • sort key: STR(utf8 collator),UINT64
Time(s) Original Optimized     Improvement
  8.16 7.18      
  8.22 6.94      
  8.29 7.35      
  8.34 7.04      
  8.24 7.37     AVG(Original) / AVG(Optimized) - 1.0
AVG 8.25 7.176   Optimized : Original 14.97%

SQL

EXPLAIN analyze SELECT ROW_NUMBER() OVER w1 FROM PART window w1 AS (PARTITION BY P_MFGR ORDER BY p_name);
  • sort key: STR(utf8 collator),STR(utf8 collator)
Time(s) Original Optimized     Improvement
  18.3 16.34      
  18.27 16.62      
  18.7 16.57      
  18.57 16.52      
  18.55 16.63     AVG(Original) / AVG(Optimized) - 1.0
AVG 18.478 16.536   Optimized : Original 11.74%

SQL

EXPLAIN analyze SELECT ROW_NUMBER() OVER w1 FROM PART window w1 AS (PARTITION BY p_name ORDER BY p_partkey);
  • sort key: INT64,STR(utf8 collator)
  • costs of block sort are about 30%
Time(s) Original Optimized     Improvement
  9.71 9.28      
  9.62 9.41      
  9.76 9.25      
  9.73 9.26      
  9.67 9.28     AVG(Original) / AVG(Optimized) - 1.0
AVG 9.698 9.296   Optimized : Original 4.32%

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Signed-off-by: Zhigao Tong <tongzhigao@pingcap.com>
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Sep 16, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • LittleFall
  • windtalker

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 16, 2022
@solotzg solotzg added type/enhancement The issue or PR belongs to an enhancement. type/performance and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 16, 2022
@ti-chi-bot ti-chi-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Sep 16, 2022
@solotzg
Copy link
Contributor Author

solotzg commented Sep 16, 2022

/run-all-tests

@solotzg
Copy link
Contributor Author

solotzg commented Sep 16, 2022

/run-unit-test

@sre-bot
Copy link
Collaborator

sre-bot commented Sep 16, 2022

Coverage for changed files

Filename                                Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Flash/tests/gtest_topn_executor.cpp         118                22    81.36%           6                 0   100.00%         235                 2    99.15%          54                13    75.93%
Interpreters/sortBlock.cpp                  200                23    88.50%          28                 0   100.00%         305                 8    97.38%         116                11    90.52%
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                       318                45    85.85%          34                 0   100.00%         540                10    98.15%         170                24    85.88%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18728      8128             56.60%    216616  83550        61.43%

full coverage report (for internal network access only)

@solotzg
Copy link
Contributor Author

solotzg commented Sep 18, 2022

/run-all-tests

1 similar comment
@solotzg
Copy link
Contributor Author

solotzg commented Sep 19, 2022

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Sep 19, 2022

Coverage for changed files

Filename                                         Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/Flash/tests/gtest_topn_executor.cpp          96                22    77.08%           6                 0   100.00%         228                 2    99.12%          50                13    74.00%
dbms/src/Interpreters/sortBlock.cpp                  200                23    88.50%          28                 0   100.00%         307                 8    97.39%         116                11    90.52%
libs/libcommon/include/common/avx2_strstr.h          149                17    88.59%          14                 0   100.00%         236                38    83.90%          82                 3    96.34%
libs/libcommon/include/common/sse2_memcpy.h           46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 1    94.44%
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                491                62    87.37%          49                 0   100.00%         882                75    91.50%         266                28    89.47%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18885      8119             57.01%    219788  83654        61.94%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 19, 2022
@solotzg
Copy link
Contributor Author

solotzg commented Sep 19, 2022

/run-all-tests

@sre-bot
Copy link
Collaborator

sre-bot commented Sep 19, 2022

Coverage for changed files

Filename                                             Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/Interpreters/sortBlock.cpp                      200                23    88.50%          28                 0   100.00%         307                 8    97.39%         116                11    90.52%
dbms/src/Interpreters/tests/gtest_block_sort.cpp          46                 7    84.78%           1                 0   100.00%          99                 0   100.00%          28                 4    85.71%
libs/libcommon/include/common/avx2_strstr.h              149                17    88.59%          14                 0   100.00%         236                38    83.90%          82                 3    96.34%
libs/libcommon/include/common/sse2_memcpy.h               46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 1    94.44%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                    441                47    89.34%          44                 0   100.00%         753                73    90.31%         244                19    92.21%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18885      8120             57.00%    219788  83685        61.92%

full coverage report (for internal network access only)

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 21, 2022
@solotzg
Copy link
Contributor Author

solotzg commented Sep 21, 2022

It seems that the optimization is all from move branches out of loops and make branches static?

from the offline discussion, we can materialize the collator look-up column, to decrease the collator look-up times from O(nlogn) to O(n)

It may also be an optional way if the overhead about materialization is small.

Copy link
Contributor

@LittleFall LittleFall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider add a FastPathPermutationSort<1>, which is commoner than 2 columns


// only for uint64, int64, string
template <typename ColumnCmpA, typename ColumnCmpB>
struct MultiColumnSortFastPath
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MultiColumn is is actullay two column?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if use 3 columns, there will be too many template instantiations. We may need to optimize later if necessary.

}
};

struct CollatorDesc : boost::noncopyable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use CollatorDesc as the name, maybe use FastSortDesc instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
};

struct PartialSortingLessWithCollation
{
const ColumnsWithSortDescriptions & columns;
const CollatorDesc & collator_desc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using CollatorDesc instead of ColumnsWithSortDescriptions seems does not make too much difference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartialSortingLessWithCollation uses member variables has_collation and need_collations of CollatorDesc

@solotzg
Copy link
Contributor Author

solotzg commented Sep 22, 2022

consider add a FastPathPermutationSort<1>, which is commoner than 2 columns

We already have such optimization :

if (description.size() == 1)
{
bool reverse = description[0].direction == -1;
const IColumn * column = !description[0].column_name.empty()
? block.getByName(description[0].column_name).column.get()
: block.safeGetByPosition(description[0].column_number).column.get();
IColumn::Permutation perm;
if (needCollation(column, description[0]))
column->getPermutation(*description[0].collator, reverse, limit, description[0].nulls_direction, perm);
else
column->getPermutation(reverse, limit, description[0].nulls_direction, perm);
size_t columns = block.columns();
for (size_t i = 0; i < columns; ++i)
block.safeGetByPosition(i).column = block.safeGetByPosition(i).column->permute(perm, limit);
}
else

Copy link
Contributor

@windtalker windtalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

others LGTM

}
};

struct PartialSortingLessWithCollation
{
const ColumnsWithSortDescriptions & columns;
const FastSortDesc & collator_desc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name should also be modified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 23, 2022
Signed-off-by: Zhigao Tong <tongzhigao@pingcap.com>
@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2022
@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/merge

@ti-chi-bot
Copy link
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: f117304

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 23, 2022
@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/rebuild

@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/run-all-tests

@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/rebuild

@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/merge

@ti-chi-bot
Copy link
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/run-integration-test

@sre-bot
Copy link
Collaborator

sre-bot commented Sep 23, 2022

Coverage for changed files

Filename                                             Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/Interpreters/sortBlock.cpp                      200                24    88.00%          28                 0   100.00%         307                 8    97.39%         116                12    89.66%
dbms/src/Interpreters/tests/gtest_block_sort.cpp          46                 7    84.78%           1                 0   100.00%          99                 0   100.00%          28                 4    85.71%
libs/libcommon/include/common/avx2_strstr.h              149                17    88.59%          14                 0   100.00%         236                38    83.90%          82                 3    96.34%
libs/libcommon/include/common/sse2_memcpy.h               46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 0   100.00%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                    441                48    89.12%          44                 0   100.00%         753                73    90.31%         244                19    92.21%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18343      7506             59.08%    214602  77144        64.05%

full coverage report (for internal network access only)

@solotzg
Copy link
Contributor Author

solotzg commented Sep 23, 2022

/unhold

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2022
@ti-chi-bot ti-chi-bot merged commit 4f66bc3 into pingcap:master Sep 23, 2022
@solotzg solotzg deleted the optimize-sort-collation branch September 23, 2022 07:18
@sre-bot
Copy link
Collaborator

sre-bot commented Sep 23, 2022

Coverage for changed files

Filename                                             Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/Interpreters/sortBlock.cpp                      200                24    88.00%          28                 0   100.00%         307                 8    97.39%         116                12    89.66%
dbms/src/Interpreters/tests/gtest_block_sort.cpp          46                 7    84.78%           1                 0   100.00%          99                 0   100.00%          28                 4    85.71%
libs/libcommon/include/common/avx2_strstr.h              149                17    88.59%          14                 0   100.00%         236                38    83.90%          82                 3    96.34%
libs/libcommon/include/common/sse2_memcpy.h               46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 0   100.00%
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                    441                48    89.12%          44                 0   100.00%         753                73    90.31%         244                19    92.21%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18343      7504             59.09%    214602  77121        64.06%

full coverage report (for internal network access only)

solotzg added a commit to solotzg/tiflash that referenced this pull request Sep 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement. type/performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants