statistics: trigger auto-analyze based on histogram row count #24382
func (t *Table) ColHistCount() float64 {
	for _, col := range t.Columns {
		if col != nil {
I think we need to check Table.Pseudo here. Probably Column.IsInvalid() also.
Why do we only try to get row count from columns? I think indexes are also useful.
1. autoAnalyzeTable has already checked Table.Pseudo.
2. We should NOT check Column.IsInvalid(): our purpose here is to get the row count of the last analyze, and even if Column.IsInvalid() is true, the row count is what we want.
3. We should NOT get the row count from indexes. Normally the index count should be the same as the column count, but if analyze index happens, the index count is not what we want.
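A minimal sketch of the column-only lookup defended in the three points above. The Column and Table types here are simplified, hypothetical stand-ins rather than the real definitions, and both the returned expression and the -1 sentinel for "no column statistics" are assumptions made for illustration.

```go
package main

import "fmt"

// Column is a hypothetical, simplified stand-in for per-column statistics.
type Column struct {
	HistCount float64 // rows covered by histogram buckets at the last analyze
	TopNCount float64 // rows covered by TopN at the last analyze
	NullCount float64 // NULL rows recorded at the last analyze
}

// TotalRowCount returns the row count this column recorded at the last analyze.
func (c *Column) TotalRowCount() float64 {
	return c.HistCount + c.TopNCount + c.NullCount
}

// Table is a hypothetical, simplified stand-in for table statistics.
type Table struct {
	Count   int64             // live row count, which keeps changing after analyze
	Columns map[int64]*Column // column statistics keyed by column ID
}

// ColHistCount returns the row count seen by the last analyze, taken from the
// first non-nil column. Indexes are deliberately not consulted, following
// point 3 above. The -1 sentinel is an assumption of this sketch.
func (t *Table) ColHistCount() float64 {
	for _, col := range t.Columns {
		if col != nil {
			return col.TotalRowCount()
		}
	}
	return -1
}

func main() {
	tbl := &Table{
		Count: 1500,
		Columns: map[int64]*Column{
			1: nil,
			2: {HistCount: 600, TopNCount: 300, NullCount: 100},
		},
	}
	fmt.Println("live count:", tbl.Count)
	fmt.Println("row count at last analyze:", tbl.ColHistCount())
}
```

The sketch skips any IsInvalid-style check on purpose: per point 2, a column whose statistics are stale still carries the row count of the last analyze, which is the value wanted here.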
OK. 👌
There is one special case. When the stats for a column are not loaded and there are NULLs in this column, (*Column).TotalRowCount() will equal the null count, which is incorrect. We need to guard against this case.
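One possible guard for this special case, sketched with hypothetical types: the Loaded flag and every name below are assumptions for illustration, not the fields or helpers used in the PR. The idea is to skip columns whose histogram is not in memory, since their total collapses to the null count.

```go
package main

import "fmt"

// colStats is a hypothetical, simplified column-statistics record.
type colStats struct {
	Loaded       bool    // assumed flag: the histogram has been loaded into memory
	NotNullCount float64 // only meaningful when Loaded is true
	NullCount    float64 // assumed to be known even when the histogram is not loaded
}

// totalRowCount mirrors the problem described above: for an unloaded column,
// NotNullCount is zero, so the total degenerates to NullCount.
func (c *colStats) totalRowCount() float64 {
	return c.NotNullCount + c.NullCount
}

// rowCountFromColumns returns the total row count of the first loaded column.
// The boolean is false when no column can be trusted.
func rowCountFromColumns(cols []*colStats) (float64, bool) {
	for _, c := range cols {
		if c == nil || !c.Loaded {
			continue // an unloaded column would report only its null count
		}
		return c.totalRowCount(), true
	}
	return 0, false
}

func main() {
	cols := []*colStats{
		{Loaded: false, NullCount: 40}, // would wrongly report 40 rows
		{Loaded: true, NotNullCount: 960, NullCount: 40},
	}
	n, ok := rowCountFromColumns(cols)
	fmt.Println(n, ok)
}
```

Returning a second boolean instead of a sentinel value is a design choice of this sketch; it lets the caller fall back to another base count when no column qualifies.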
I find that the newly added unit test can pass on the current master branch.
690c99c to fa2a4c4
@@ -214,6 +214,16 @@ func (t *Table) GetStatsInfo(ID int64, isIndex bool) (int64, *Histogram, *CMSket
	return int64(colStatsInfo.TotalRowCount()), colStatsInfo.Histogram.Copy(), colStatsInfo.CMSketch.Copy(), colStatsInfo.TopN.Copy(), colStatsInfo.FMSketch.Copy()
}

// ColHistCount returns the count of the column histograms.
func (t *Table) ColHistCount() float64 {
The name could be changed since the topn is split out of the histogram.
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
cherry pick to release-4.0 in PR #26706
cherry pick to release-5.0 in PR #26707
cherry pick to release-5.1 in PR #26708
What problem does this PR solve?
Issue Number: close #24237, related to #26282
Problem Summary:
Trigger auto analyze in a more proper way.
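To make the problem concrete, here is a rough sketch of a ratio-based trigger check and of how the choice of base count changes its outcome. It assumes the check compares modified rows against a base row count and a configured ratio; the function name, parameters, and numbers are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// needsAnalyze reports whether the share of modified rows, relative to
// baseCount, exceeds ratio. Treating a non-positive base or ratio as
// "never trigger" is a choice made for this sketch.
func needsAnalyze(baseCount float64, modifyCount int64, ratio float64) bool {
	if baseCount <= 0 || ratio <= 0 {
		return false
	}
	return float64(modifyCount)/baseCount > ratio
}

func main() {
	const ratio = 0.5 // hypothetical threshold

	histRowCount := 1000.0    // rows recorded by the last analyze
	modifyCount := int64(800) // rows inserted since then
	liveCount := histRowCount + float64(modifyCount)

	// With the live count as the base, the inserts inflate the denominator:
	// 800/1800 is below the threshold.
	fmt.Println("base = live count:     ", needsAnalyze(liveCount, modifyCount, ratio))

	// With the last-analyze row count as the base: 800/1000 is above it.
	fmt.Println("base = histogram count:", needsAnalyze(histRowCount, modifyCount, ratio))
}
```

Under these assumptions, a base that grows with every insert makes the ratio harder to reach the more a table changes, while a base fixed at the last analyze measures the drift directly.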
What is changed and how it works?
What's Changed:
Use the histogram row count as the base count for checking auto analyze, instead of the Count in Table.
Related changes
N/A
Check List
Tests
Side effects
N/A
Release note