When running a restore, we found that the regions were unbalanced.
We then checked the PD dashboard and found many scatter failures:
According to the DEBUG log, the main reason for failing to create the scatter operator is unhealthy. But the cluster looks fine. (The empty regions are freshly split and will be filled with data soon.)
Note that this metric may not be accurate enough.
We also found that many add-rule-peer operators were created during scattering.
I'm wondering: what is the reason for that failure? How can I find out why those add-rule-peer operators were created, and are they related to the failure?
The add-rule-peer problem is tracked by #4565 and has been fixed. The scatter operator fails to be created because the region has pending peers. We can check for pending peers before starting to scatter the region. See pingcap/tidb#31691.
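For illustration, the pre-scatter check suggested above could look roughly like the sketch below. The `Peer` and `Region` types and both functions are hypothetical stand-ins written for this comment; they are not PD's or BR's actual API, and the real fix is the one in pingcap/tidb#31691. The idea is only to separate regions that still have pending peers from those that are ready to be scattered.

```go
package main

import "fmt"

// Peer and Region are hypothetical, simplified stand-ins for the region
// metadata a client would get back from PD. They are NOT the real PD types.
type Peer struct {
	ID      uint64
	StoreID uint64
}

type Region struct {
	ID           uint64
	Peers        []Peer
	PendingPeers []Peer // peers that have not caught up with the leader yet
}

// scatterSkipReason returns a non-empty reason when the region should not be
// scattered yet, and an empty string when it looks ready.
func scatterSkipReason(r Region) string {
	if len(r.PendingPeers) > 0 {
		return fmt.Sprintf("region %d has %d pending peer(s)", r.ID, len(r.PendingPeers))
	}
	return ""
}

// partitionForScatter splits regions into those that can be scattered now and
// those that should be deferred, keyed by region ID with the reason attached.
func partitionForScatter(regions []Region) (ready []Region, deferred map[uint64]string) {
	deferred = make(map[uint64]string)
	for _, r := range regions {
		if reason := scatterSkipReason(r); reason != "" {
			deferred[r.ID] = reason
			continue
		}
		ready = append(ready, r)
	}
	return ready, deferred
}
```

A caller would scatter only the `ready` regions and re-fetch the deferred ones after a short wait, assuming the pending peers catch up once the freshly split regions settle.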
Bug Report
cc pingcap/tidb#31034
What did you do?
Execute ScatterRegions over some fresh regions created by BatchSplitRegion.
What did you expect to see?
The regions are scattered and balanced.
What did you see instead?
Some of the operators failed to be created because of unhealthy, and the final region distribution isn't balanced.
What version of PD are you using (pd-server -V)?
Note
(The details of metric and cluster info TBD)