BR suffers a 15x performance regression when single TiKV node down #42973
Comments
The reason is:
BTW, the current implementation jumps directly to fine-grained backup when any store is unreachable (lines 85 to 90 in 0548d61).
This may slow down the backup in this scenario.
There isn't a trivial fix for this problem. I think we must first figure out why PD returns the wrong store state (Note that
@nolouch, please take a look. The question is what the best way is to tell whether a store is healthy. Apparently StoreState_Up is not always accurate, since a store may need to wait 30 minutes before it becomes StoreState_Down.
br using
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
2. What did you expect to see? (Required)
The backup should be slightly slower than backing up a healthy cluster.
3. What did you see instead (Required)
The backup is about 15x slower than backing up a healthy cluster (4 minutes vs. 1 hour).
4. What is your TiDB version? (Required)
Close to master, but this problem is not strongly tied to the version.