Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statistics: trigger auto-analyze based on histogram row count #24382

Merged
merged 14 commits into from
Jul 28, 2021

Conversation

eurekaka
Copy link
Contributor

@eurekaka eurekaka commented Apr 29, 2021

What problem does this PR solve?

Issue Number: close #24237, related to #26282

Problem Summary:

Trigger auto analyze in a more proper way.

What is changed and how it works?

What's Changed:

Use the histogram row count as the base count for checking auto analyze, instead of the Count in Table.

Related changes

N/A

Check List

Tests

  • Unit test

Side effects

N/A

Release note

  • Trigger auto-analyze based on histogram row count

@eurekaka eurekaka added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner component/statistics labels Apr 29, 2021
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Apr 29, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • qw4990
  • time-and-fate

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@eurekaka eurekaka requested a review from a team as a code owner April 29, 2021 10:25
@eurekaka eurekaka requested review from XuHuaiyu and removed request for a team April 29, 2021 10:25
@ti-chi-bot ti-chi-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 29, 2021
@eurekaka eurekaka requested review from time-and-fate, qw4990 and winoros and removed request for XuHuaiyu April 29, 2021 10:26
@purelind
Copy link
Contributor

purelind commented May 2, 2021

/run-all-tests

statistics/handle/update.go Outdated Show resolved Hide resolved
Comment on lines +218 to +220
func (t *Table) ColHistCount() float64 {
for _, col := range t.Columns {
if col != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to check Table.Pseudo here.
Probably Column.IsInvalid() also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we only try to get row count from columns? I think indexes are also useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1, autoAnalyzeTable has already checked Table.Pseudo;
2, We should NOT check Column.IsInvalid(), since our purpose here is to get the row count of last analyze, even if Column.IsInvalid() is true, the row count is what we want;
3, We should NOT get row count from indexes, normally, index count should be same as column count, but if analyze index happens, index count is not what we want;

Copy link
Member

@time-and-fate time-and-fate May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. 👌

There is one special case. When the stats for a column is not loaded and there are NULLs in this column, the (*Column).TotalRowCount() will equal null count, which is incorrect. We need to check against this case.

Copy link
Member

@time-and-fate time-and-fate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find that the newly added unit test can pass on the current master branch.

statistics/handle/update_test.go Outdated Show resolved Hide resolved
statistics/handle/update_test.go Outdated Show resolved Hide resolved
@eurekaka eurekaka force-pushed the auto_analyze branch 2 times, most recently from 690c99c to fa2a4c4 Compare May 8, 2021 10:07
@@ -214,6 +214,16 @@ func (t *Table) GetStatsInfo(ID int64, isIndex bool) (int64, *Histogram, *CMSket
return int64(colStatsInfo.TotalRowCount()), colStatsInfo.Histogram.Copy(), colStatsInfo.CMSketch.Copy(), colStatsInfo.TopN.Copy(), colStatsInfo.FMSketch.Copy()
}

// ColHistCount returns the count of the column histograms.
func (t *Table) ColHistCount() float64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name could be changed since the topn is split out of the histogram.

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 28, 2021
@qw4990
Copy link
Contributor

qw4990 commented Jul 28, 2021

/merge

@time-and-fate
Copy link
Member

/run-check_dev_2
/run-unit-test

@qw4990
Copy link
Contributor

qw4990 commented Jul 28, 2021

/merge

@time-and-fate
Copy link
Member

/run-unit-test

4 similar comments
@time-and-fate
Copy link
Member

/run-unit-test

@time-and-fate
Copy link
Member

/run-unit-test

@time-and-fate
Copy link
Member

/run-unit-test

@zhouqiang-cl
Copy link
Contributor

/run-unit-test

@ti-chi-bot ti-chi-bot merged commit 50ae2d8 into pingcap:master Jul 28, 2021
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jul 28, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-4.0 in PR #26706

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jul 28, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-5.0 in PR #26707

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jul 28, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-5.1 in PR #26708

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/statistics needs-cherry-pick-release-5.0 needs-cherry-pick-release-5.1 sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

change the way of checking auto analyze ratio
8 participants