statistics: fix repetitive selectivity accounting and stabilize the result #15536
// getUsableSetsByGreedy will select the indices and pk used for calculate selectivity by greedy algorithm.
func getUsableSetsByGreedy(nodes []*StatsNode) (newBlocks []*StatsNode) {
// GetUsableSetsByGreedy will select the indices and pk used for calculate selectivity by greedy algorithm.
func GetUsableSetsByGreedy(nodes []*StatsNode) (newBlocks []*StatsNode) {
would it impact the performance?
Maybe. Because the input order has changed, the result of the greedy search may change.
lgtm
LGTM
/run-all-tests
do we have this problem in release 3.0 and release 3.1?
@zz-jason Yes
cherry pick to release-2.1 in PR #16050
cherry pick to release-3.0 in PR #16051
cherry pick to release-3.1 in PR #16052
Signed-off-by: sre-bot <sre-bot@pingcap.com>
What problem does this PR solve?
Problem Summary:
- In `Selectivity`, the index order in `coll.Indices` is non-deterministic, so the greedy search algorithm may return different results in different runs. That would confuse users, since the stats have not changed at all;
- For a condition such as `t.a = 1 and t.b > 1 and t.c > 1` with 2 indexes, `idx1(a,b)` and `idx2(a,c)`, the greedy algorithm would choose both indexes and multiply their separately computed selectivities. This is wrong, because the selectivity of `t.a = 1` is counted twice.
What is changed and how it works?
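To make the double counting from the Problem Summary concrete, here is a small sketch with made-up numbers. The per-predicate selectivities and the independence assumption are purely illustrative, and `naiveProduct` is a hypothetical helper, not a function from the TiDB code base.

```go
package main

import "fmt"

// naiveProduct multiplies the selectivities of every chosen index,
// which is what the problem description says the greedy result did.
// Hypothetical helper for illustration only.
func naiveProduct(sels ...float64) float64 {
	p := 1.0
	for _, s := range sels {
		p *= s
	}
	return p
}

func main() {
	// Made-up selectivities for t.a = 1, t.b > 1 and t.c > 1,
	// assumed independent of each other.
	selA, selB, selC := 0.25, 0.5, 0.5

	idx1 := selA * selB // idx1(a,b) covers a and b
	idx2 := selA * selC // idx2(a,c) covers a and c

	naive := naiveProduct(idx1, idx2) // selA is included twice
	expected := selA * selB * selC    // each predicate counted once

	// The ratio naive/expected equals selA, the extra factor.
	fmt.Println(naive, expected, naive/expected)
}
```

With these numbers the naive product is smaller than the expected value by exactly the factor `selA`, which is the repeated accounting the PR title refers to.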
What's Changed:
- Sort the `StatsNode` slice before greedy search;
How it Works:
Note that how we sort the `StatsNode` slice affects the greedy search result. I put the PK at the end of the slice, indexes in the middle, and columns at the front, to enforce the heuristic that the PK is preferred over indexes in estimation, and indexes are preferred over columns.
Related changes
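The ordering described under "How it Works" could be sketched roughly as below. This is a simplified illustration, not the code from this PR: the `StatsNode` struct here has only two fields, the constant values are placeholders, the `compareType` signature and the `rank` helper are assumptions, and the tie-break on `ID` is added only to make the order fully deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Placeholder values; the real constants are used by feedback encoding
// and their values are deliberately left untouched by the PR.
const (
	IndexType = iota
	PkType
	ColType
)

// StatsNode is a reduced stand-in for the real struct.
type StatsNode struct {
	Tp int
	ID int64
}

// rank maps a node type to its position class in the sorted slice:
// columns first, indexes in the middle, PK last. Hypothetical helper.
func rank(tp int) int {
	switch tp {
	case ColType:
		return 0
	case IndexType:
		return 1
	default: // PkType
		return 2
	}
}

// compareType returns a negative value if type a sorts before type b,
// zero if they are in the same class, and a positive value otherwise.
// The signature is an assumption.
func compareType(a, b int) int {
	return rank(a) - rank(b)
}

// sortNodes orders nodes by type class, then by ID (assumed tie-break).
func sortNodes(nodes []*StatsNode) {
	sort.Slice(nodes, func(i, j int) bool {
		if c := compareType(nodes[i].Tp, nodes[j].Tp); c != 0 {
			return c < 0
		}
		return nodes[i].ID < nodes[j].ID
	})
}

func main() {
	nodes := []*StatsNode{
		{Tp: PkType, ID: 1},
		{Tp: IndexType, ID: 2},
		{Tp: ColType, ID: 3},
		{Tp: IndexType, ID: 1},
	}
	sortNodes(nodes)
	for _, n := range nodes {
		fmt.Println(n.Tp, n.ID)
	}
}
```

Running this prints the column node first, then the two index nodes ordered by ID, then the PK node. Keeping the ordering in a comparison function means the numeric values of the type constants never need to change.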
Check List
Tests
Side effects
A `compareType` function is used, instead of changing the values of `IndexType`/`PkType`/`ColType`, because feedback encoding uses these constants.
Release note