tikv: fix infinite follower/learner retry when network partition only between leader and follower/learner (#17441) #17443
cherry-pick #17441 to release-4.0
What problem does this PR solve?
Issue Number: close #17442
Problem Summary:
#16933 introduced a mechanism that rechecks store liveness when sending a request fails. It works well for leader-based requests.
For follower or learner requests, however, it can cause infinite retries.
When there is a network partition between the leader and the followers/learners, while TiDB-Server can still reach the followers and learners, those peers return a timeout error because they cannot catch up with the leader. The store liveness recheck still succeeds in this case, so it does not help. It is better to retry other peers immediately in this situation.
What is changed and how it works?
What's Changed:
Retry immediately instead of checking store liveness when the request is a follower/learner read.
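The decision described above can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the `ReplicaKind` and `Action` types and the `onSendFail` function are hypothetical names made up for the example, and the real client logic involves more state than shown here.

```go
package main

import "fmt"

// ReplicaKind is a hypothetical label for the kind of peer a request targets.
type ReplicaKind int

const (
	Leader ReplicaKind = iota
	Follower
	Learner
)

// Action is a hypothetical label for what the client does after a send failure.
type Action int

const (
	CheckStoreLiveness Action = iota // recheck the store before retrying
	RetryOtherPeer                   // skip the recheck and try another peer
)

func (k ReplicaKind) String() string {
	return [...]string{"leader", "follower", "learner"}[k]
}

func (a Action) String() string {
	return [...]string{"check store liveness", "retry other peer"}[a]
}

// onSendFail picks the follow-up action for a failed request.
// Assumption for this sketch: the replica kind alone drives the decision.
func onSendFail(kind ReplicaKind) Action {
	if kind == Follower || kind == Learner {
		// The store may be reachable from TiDB yet partitioned from the
		// leader, so a liveness recheck would succeed without helping.
		return RetryOtherPeer
	}
	return CheckStoreLiveness
}

func main() {
	for _, k := range []ReplicaKind{Leader, Follower, Learner} {
		fmt.Printf("%v read failed: %v\n", k, onSendFail(k))
	}
}
```

In this sketch, leader requests keep the liveness recheck from #16933, while follower and learner reads move on to another peer right away.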
Related changes
Check List
Tests
Side effects
Release note