store/tikv: drop the unreachable store's regions from cache. #2792
store/tikv/region_cache_test.go
Outdated
@@ -254,6 +254,28 @@ func (s *testRegionCacheSuite) TestRequestFail(c *C) {
	c.Assert(region.unreachableStores, HasLen, 0)
}

func (s *testRegionCacheSuite) TestRequestFail2(c *C) {
	// ['' - 'm' - 'z']
Add a comment saying that this is the key range.
store/tikv/region_cache_test.go
Outdated
	c.Assert(err, IsNil)
	c.Assert(loc2.Region.id, Equals, region2)

	// Request fails on region1.
s/fails/should fail/
store/tikv/region_cache_test.go
Outdated
	s.cache.OnRequestFail(ctx)
	// Both region2 and store should be dropped from cache.
	c.Assert(s.cache.storeMu.stores, HasLen, 0)
	c.Assert(s.cache.getRegionFromCache([]byte("x")), IsNil)
It's hard for me to understand this test case.
It first splits the cluster into two regions (region1 and region2) whose leaders are on the same store (store1). After a request fails on region1, we check that region2 and store1 are removed from the cache.
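The scenario described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in this PR: `toyCache`, `toyRegion`, `newToyCache`, and `onRequestFail` are hypothetical names, and the real cache tracks far more state.

```go
package main

import "fmt"

// toyRegion is a hypothetical stand-in for a cached region entry.
type toyRegion struct {
	id          uint64
	leaderStore uint64 // ID of the store hosting this region's leader
}

// toyCache is a hypothetical, heavily simplified region cache.
type toyCache struct {
	stores  map[uint64]string     // store ID -> address
	regions map[uint64]*toyRegion // region ID -> cached region
}

// newToyCache builds the scenario from the test: two regions whose
// leaders both live on store 1.
func newToyCache() *toyCache {
	return &toyCache{
		stores: map[uint64]string{1: "store1"},
		regions: map[uint64]*toyRegion{
			1: {id: 1, leaderStore: 1},
			2: {id: 2, leaderStore: 1},
		},
	}
}

// onRequestFail models a send failure on failedRegion. The leader's store is
// dropped, and so is every other cached region led by that store. The failed
// region itself stays cached so that its other peers can still be tried.
func (c *toyCache) onRequestFail(failedRegion uint64) {
	r, ok := c.regions[failedRegion]
	if !ok {
		return
	}
	storeID := r.leaderStore
	delete(c.stores, storeID)
	for id, other := range c.regions {
		if id != failedRegion && other.leaderStore == storeID {
			delete(c.regions, id) // deleting during range is safe in Go
		}
	}
}

func main() {
	c := newToyCache()
	c.onRequestFail(1)
	fmt.Println(len(c.stores), len(c.regions)) // prints "0 1"
}
```

In this model the store map ends up empty and only region 1 remains, which mirrors the `HasLen, 0` and `IsNil` assertions in the test.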
LGTM
PTAL @ngaut
	// Both region2 and store should be dropped from cache.
	c.Assert(s.cache.storeMu.stores, HasLen, 0)
	c.Assert(s.cache.getRegionFromCache([]byte("x")), IsNil)
	s.checkCache(c, 1)
If only one store is used, why is there still one region in the cache?
region1 is not removed: we have to try its other peers before reloading it from PD, because PD may not have been updated in time.
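A minimal sketch of that "try the other peers first" idea follows. The names `peerRegion`, `markUnreachable`, and `nextPeer` are hypothetical and chosen for illustration; the PR's own region type keeps an `unreachableStores` field, but its exact shape is not shown here.

```go
package main

import "fmt"

// peerRegion is a hypothetical cached region with several replicas.
type peerRegion struct {
	peers       []uint64        // store IDs that hold a replica
	unreachable map[uint64]bool // stores that already failed for this region
}

// markUnreachable records that a request to storeID failed.
func (r *peerRegion) markUnreachable(storeID uint64) {
	if r.unreachable == nil {
		r.unreachable = make(map[uint64]bool)
	}
	r.unreachable[storeID] = true
}

// nextPeer returns the first peer that has not been marked unreachable.
// ok is false once every peer has failed; only at that point would the
// caller fall back to reloading the region from PD.
func (r *peerRegion) nextPeer() (storeID uint64, ok bool) {
	for _, p := range r.peers {
		if !r.unreachable[p] {
			return p, true
		}
	}
	return 0, false
}

func main() {
	r := &peerRegion{peers: []uint64{1, 2, 3}}
	r.markUnreachable(1)
	p, ok := r.nextPeer()
	fmt.Println(p, ok) // prints "2 true"
}
```

Keeping the failed region in the cache costs one entry, but it avoids asking PD for routing information that may still point at the dead store.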
LGTM
When a store is down and it holds a lot of regions (possibly 10k+), TiDB suffers from "connection refused" errors for a long time.
cc @AndreMouche @shenli @hhkbp2