Bug Report: vttablet replica is considered healthy (serving) even when not connected to primary #9788
Comments
Hi @vkovacik! What version of MySQL are you using here? It feels like a bug in MySQL to show In this case, assuming this is still an issue in the most recent MySQL versions, we would have to consider the lag to also be "unknown" if the Thank you for the great issue!
@mattlord It's recent Percona MySQL 5.7.36-39
Yeah, I think when replication is checked by polling "SHOW SLAVE STATUS", besides checking SQL_Delay we also need to check that both the IO and SQL threads are running. I'm not sure about vanilla MySQL, but Percona 5.7 reports 0 slave lag when the IO thread is not running. Links:
@vkovacik ok, thanks! MySQL and Percona are generally the same (MariaDB is not, after 5.5). Given that, I think the fix may be as simple as this:
We treat "Connecting" as equivalent to "Running" to avoid flapping on low traffic systems due to
Yeah, I agree
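The check discussed in the comments above can be sketched roughly as follows. This is a minimal illustration, not the Vitess implementation: the `SlaveStatus` type, the `replicationLag` function, and the `connectingCountsAsRunning` parameter are hypothetical names introduced here. Only the `Slave_IO_Running`, `Slave_SQL_Running`, and `Seconds_Behind_Master` columns of `SHOW SLAVE STATUS` are taken from MySQL itself. Because the comment about "Connecting" is truncated in the thread, the sketch leaves that behavior as a parameter instead of choosing one.

```go
package main

import "fmt"

// SlaveStatus is a hypothetical struct holding the few
// SHOW SLAVE STATUS columns this sketch needs.
type SlaveStatus struct {
	SlaveIORunning      string // Slave_IO_Running: "Yes", "No", or "Connecting"
	SlaveSQLRunning     string // Slave_SQL_Running: "Yes" or "No"
	SecondsBehindMaster uint64 // Seconds_Behind_Master as reported by the server
}

// replicationLag returns the reported lag and whether it can be trusted.
// The lag is treated as unknown unless both replication threads are running.
// connectingCountsAsRunning is an assumption of this sketch: it controls
// whether an IO thread in the "Connecting" state counts as running.
func replicationLag(s SlaveStatus, connectingCountsAsRunning bool) (uint64, bool) {
	ioOK := s.SlaveIORunning == "Yes" ||
		(connectingCountsAsRunning && s.SlaveIORunning == "Connecting")
	sqlOK := s.SlaveSQLRunning == "Yes"
	if !ioOK || !sqlOK {
		return 0, false // lag is unknown, not zero
	}
	return s.SecondsBehindMaster, true
}

func main() {
	// The state described in this issue: IO thread down, SQL thread up, lag 0.
	s := SlaveStatus{SlaveIORunning: "No", SlaveSQLRunning: "Yes", SecondsBehindMaster: 0}
	lag, known := replicationLag(s, true)
	fmt.Println("lag:", lag, "known:", known)
}
```

Returning a separate `known` flag keeps "lag is 0" distinct from "lag could not be determined", which is the distinction the thread says is lost when only the reported lag is inspected.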
Overview of the Issue
Short description:
Replica tablet is serving reads even when replication IO thread is disconnected and replica data is stale for more than unhealthy_threshold.
Long description:
We were seeing inconsistencies between data returned from the replica and data returned from the primary tablet. After investigating, I found that replication on the replica is not connected to the primary, yet vtgate shows the replica as actively participating in reads (serving).
Affected shard:
Replication status on replica (las-0832024891):
Note that IO thread is not connected, SQL thread is running and reported replication lag is 0.
Serving state:
Screenshot from vtgate status:
Full replication status output:
Reproduction Steps
Binary Version
/vt/bin/vtgate --version
Version: 13.0.0 (Git revision bc4a9606e1 branch 'heads/v13.0.0') built on Tue Feb 22 14:23:16 UTC 2022 by vitess@buildkitsandbox using go1.17.6 linux/amd64
Operating System and Environment details
Log Fragments
No response