statistics: batch insert topn and bucket when saving table stats (#35326) #35548
cherry-pick #35326 to release-5.3
You can switch your code base to this Pull Request by using git-extras:
# In tidb repo:
git pr https://github.com/pingcap/tidb/pull/35548
After applying your modifications, you can push your changes to this PR via:
What problem does this PR solve?
Issue Number: ref #35142
Problem Summary:
Analyzing a partitioned table is slower than analyzing a non-partitioned table with the same amount of data.
What is changed and how it works?
In SaveTableStatsToStorage, we execute one insert statement for each topn and each bucket, so the transaction contains too many insertions and the function becomes time-consuming. This PR batches the insertions for topn and bucket, which makes SaveTableStatsToStorage more efficient.
Check List
Tests
For a table with 40 million rows and 20 partitions, analyze takes 9min and SaveTableStatsToStorage for one partition takes 21s. After the PR, analyze takes 3min and SaveTableStatsToStorage for one partition takes 4s.
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.