stats: do not split excluded lower value ranges #12009

alivxxx · 2019-09-03T11:27:51Z

What problem does this PR solve?

Fix #11907

This bug occurs on indices with multiple columns, such as index idx(a, b), in the following situation:

Queries that only constrain a prefix of the index, such as where a >= 10, are issued and collected by feedback. As a result, the upper bound of one bucket becomes a single encoded value (10), and the upper bound of the next bucket may be (10, 11).
Then another query arrives, such as where a >= 9, so SplitRange is called with [9, +inf). It splits by (10) and (10, 11), which results in ranges [9, 10], (10, (10, 11)], ..., and everything looks fine so far.
The caller, IndexRangesToKVRanges, then processes the split range (10, (10, 11)]. Since the lower bound is excluded, PrefixNext transforms (10, (10, 11)] into [11, (10, 12)), which is an invalid range.

What is changed and how it works?

When we split ranges, do not generate excluded lower ranges. This PR does so by splitting the ranges by lower bound and always generating ranges with an included lower bound and an excluded upper bound.

Check List

Tests

Unit test

Code changes

Has exported function/method change

Side effects

None

Related changes

Need to cherry-pick to the release branch

Release note

Write release note for bug-fix or new feature.

codecov · 2019-09-03T11:33:30Z

Codecov Report

Merging #12009 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #12009   +/-   ##
===========================================
  Coverage   81.5929%   81.5929%           
===========================================
  Files           452        452           
  Lines         98060      98060           
===========================================
  Hits          80010      80010           
  Misses        12401      12401           
  Partials       5649       5649

winoros

Can we append MaxValueDatum/MinValueDatum when we find that the lengths of the upper and lower bounds are not the same?

alivxxx · 2019-09-04T09:06:13Z

@winoros The main reason I did not choose this solution is that there are old histograms, and it is not easy to determine their original number of columns, because a bound may already have gone through PrefixNext many times and may not be decodable. To use this approach, we would need to increase the stats version and only use feedback on newly created histograms.

lzmhhh123

LGTM.

statistics/histogram.go

eurekaka · 2019-09-11T08:16:22Z

statistics/handle/update.go

@@ -766,11 +766,11 @@ func formatBuckets(hg *statistics.Histogram, lowBkt, highBkt, idxCols int) strin
 		return hg.BucketToString(lowBkt, idxCols)
 	}
 	if lowBkt+1 == highBkt {
-		return fmt.Sprintf("%s, %s", hg.BucketToString(lowBkt, 0), hg.BucketToString(highBkt, 0))
+		return fmt.Sprintf("%s, %s", hg.BucketToString(lowBkt, idxCols), hg.BucketToString(highBkt, idxCols))


Why do we need this change?

Without it, the result for an index is unreadable.

Co-Authored-By: Kenan Yao <cauchy1992@gmail.com>

eurekaka

LGTM

sre-bot · 2019-09-11T08:57:44Z

/run-all-tests

sre-bot · 2019-09-11T09:05:05Z

cherry pick to release-2.1 failed

sre-bot · 2019-09-11T09:07:26Z

cherry pick to release-3.0 failed

stats: do not split excluded lower value ranges

4082983

alivxxx added type/bugfix This PR fixes a bug. component/statistics needs-cherry-pick-2.1 labels Sep 3, 2019

alivxxx requested review from eurekaka, winoros and lzmhhh123 September 3, 2019 11:27

winoros reviewed Sep 4, 2019

lzmhhh123 reviewed Sep 9, 2019

lzmhhh123 added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 9, 2019

eurekaka reviewed Sep 11, 2019

Update statistics/histogram.go

e8cd4ac

Co-Authored-By: Kenan Yao <cauchy1992@gmail.com>

alivxxx requested a review from eurekaka September 11, 2019 08:37

eurekaka approved these changes Sep 11, 2019

Merge branch 'master' into fb

db5050f

eurekaka added status/LGT2 Indicates that a PR has LGTM 2. status/can-merge Indicates a PR has been approved by a committer. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 11, 2019

sre-bot merged commit 440bb74 into pingcap:master Sep 11, 2019

alivxxx deleted the fb branch September 12, 2019 06:29

alivxxx added a commit to alivxxx/tidb that referenced this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (pingcap#12009)

e4e2552

alivxxx mentioned this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (#12009) #12170

Merged

alivxxx added a commit to alivxxx/tidb that referenced this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (pingcap#12009)

18758e5

alivxxx mentioned this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (#12009) #12171

Merged

alivxxx added a commit to alivxxx/tidb that referenced this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (pingcap#12009)

e282ae3

alivxxx mentioned this pull request Sep 12, 2019

stats: do not split excluded lower value ranges (#12009) #12172

Merged

sre-bot pushed a commit that referenced this pull request Sep 13, 2019

stats: do not split excluded lower value ranges (#12009) (#12172)

c3c04c6

sre-bot pushed a commit that referenced this pull request Sep 16, 2019

stats: do not split excluded lower value ranges (#12009) (#12170)

1d489fb

sre-bot pushed a commit that referenced this pull request Sep 16, 2019

stats: do not split excluded lower value ranges (#12009) (#12171)

84bc4da
