Backups intermittently fail #2588
Comments
Pavel, do you have time to look into this?
It would also be great if we could have some kind of alerting mechanism, or an API of some kind, to show that backups have failed.
I don't understand this issue. The first log excerpt shows mysqld being shut down and the backup starting; I don't see any failure message. The second log excerpt doesn't show this part, but it looks like it shows mysqld being started after the backup and the RPC TabletManager.Backup succeeding.
This was a problem on our infrastructure end. During backups, health checks report the tablet as unhealthy, so after a certain time the tablets were being restarted, which caused the backups to fail. We have fixed this by changing our health checks to use TabletState from /debug/vars instead of /health.
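For anyone hitting the same restart loop, here is a minimal sketch in Python of the kind of check described above. It treats the tablet as alive when /debug/vars returns valid JSON containing a TabletState key, rather than relying on /health. The function names and the choice to accept any TabletState value are assumptions for illustration; the thread does not say which values the fixed check accepts.

```python
import json
import urllib.request


def tablet_is_alive(debug_vars_body: str, key: str = "TabletState") -> bool:
    """Return True if the /debug/vars payload parses as a JSON object and reports `key`.

    Assumption: the presence of the key is enough to show the process is
    responding; no particular TabletState value is required.
    """
    try:
        payload = json.loads(debug_vars_body)
    except (TypeError, ValueError):
        # Not JSON at all, so the endpoint is not answering as expected.
        return False
    return isinstance(payload, dict) and key in payload


def check_tablet(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/debug/vars and apply tablet_is_alive to the body."""
    try:
        with urllib.request.urlopen(f"{base_url}/debug/vars", timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection errors, HTTP errors, and timeouts.
        return False
    return tablet_is_alive(body)
```

A probe could call `check_tablet("http://localhost:15101")`, where the host and port are placeholders for the tablet's HTTP address, and restart the tablet only when this returns False.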
* backport of 2588 * mysqlctl: Ensure to implement server side API Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com> Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
If this errors, we can't dereference v and read this value. This was broken in vitessio#13449 Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Following is the log when the backups fail:
Following is the successful backup log:
As a workaround, we have added retries for backups in case of failure, but the failures are extremely intermittent at this point.
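The retry workaround can be sketched roughly as follows, in Python. This is not our actual tooling: `run_with_retries`, its parameters, and the idea of wrapping the backup call in a callable that returns True on success are all assumptions made for the example.

```python
import time
from typing import Callable


def run_with_retries(
    action: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `action` until it returns True or `attempts` tries are used up.

    `action` is a hypothetical wrapper around whatever triggers the backup
    and reports success. `sleep` is injectable so the wait can be skipped
    in tests.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return True
        if attempt < attempts:
            # Wait before the next try; no wait after the final failure.
            sleep(delay_seconds)
    return False
```

Returning False after the last attempt, instead of retrying forever, leaves room to raise an alert when a backup never succeeds.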
It almost feels like there is a clean shutdown being triggered during backups.