healthcheck: attempt to update primary only if the current tablet is serving #8121
…serving Signed-off-by: deepthi <deepthi@planetscale.com>
Before logs:
After logs:
lgtm
@@ -200,7 +200,7 @@ func (thc *tabletHealthCheck) processResponse(hc *HealthCheckImpl, shr *query.St
 	thc.setServingState(serving, reason)

 	// notify downstream for master change
-	hc.updateHealth(thc.SimpleCopy(), prevTarget, trivialUpdate, true)
+	hc.updateHealth(thc.SimpleCopy(), prevTarget, trivialUpdate, thc.Serving)
nit: thc.Serving should not be false in practice (?). Might want to add a warning log here when we change master to a tablet that is still down.
It can be true or false, and this gets called for all tablet types. We only do anything with that value if the tablet_type is MASTER.
The healthy list update code inside the call to updateHealth is pretty solid, I think. It handles the true/false cases correctly. It will never replace the current master with one that is down.
Does this address your concern?
Ah, got it. So this code was actually "incorrectly" always claiming isPrimaryUp=true in the health update.
This looks good.
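To illustrate the behavior discussed in this thread, here is a minimal sketch of a healthy-primary update that honors an isPrimaryUp flag. It is not the Vitess implementation: the types tabletHealth and healthyPrimaries and the method updatePrimary are hypothetical names, and removing the entry when the same tablet reports as down is an assumption about reasonable behavior, not something stated in this PR.

```go
package main

// tabletHealth is a hypothetical, simplified stand-in for a tablet health record.
type tabletHealth struct {
	Alias   string
	Serving bool
}

// healthyPrimaries maps a shard key to the primary currently considered healthy.
// (Hypothetical type, used only for this sketch.)
type healthyPrimaries map[string]*tabletHealth

// updatePrimary records th as the shard's primary only when isPrimaryUp is true.
// When the flag is false, the entry is removed only if it belongs to the same
// tablet (an assumption), so a tablet that is down never displaces a serving primary.
func (h healthyPrimaries) updatePrimary(shard string, th *tabletHealth, isPrimaryUp bool) {
	if isPrimaryUp {
		h[shard] = th
		return
	}
	if cur, ok := h[shard]; ok && cur.Alias == th.Alias {
		delete(h, shard)
	}
}
```

With the flag hard-coded to true, as in the line removed by the diff above, the first branch would always run and a non-serving tablet could overwrite the entry. Passing the tablet's serving state lets the second branch apply instead.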
Description
Fixes #8120
Related Issue(s)
Checklist
Deployment Notes