Fix error checking in ERS #9330
@harshit-gangal @deepthi To fix the issue, we have currently changed the type of […]. There is another approach that we can take. For now, the mysql package holds the constants for the MySQL error numbers. We can add a new error number representing […]. Pro: the approach is easier to understand by looking at the code, and it would eliminate the substring comparison that we have. WDYT?
Force-pushed from 3fda4a6 to 2046330.
@GuptaManan100 upon further thought, I think this is an important enough error case that we should add a new error code (while ensuring that it doesn't collide with MySQL's error codes). There are enough gaps in the MySQL error code space to add a section of Vitess-specific errors. Keeping them close together will let us avoid blurring the difference between MySQL errors and ours.
Fixing the bug in ERS has led to a new issue. Consider the following scenario -
There doesn't seem to be a way to fix this problem without further deliberation. We can't wait on all n tablets either: if the primary were actually down, we would exhaust all our time waiting for it to return!
The solution to the above problem is to always wait for the newPrimary that the user has asked for, plus any other n-2 tablets. Proceeding with this change.
Signed-off-by: Manan Gupta <manan@planetscale.com>
…hich have no replication status Signed-off-by: Manan Gupta <manan@planetscale.com>
…ewly introduced sql error number Signed-off-by: Manan Gupta <manan@planetscale.com>
…loser to as though it failed Signed-off-by: Manan Gupta <manan@planetscale.com>
…e status map Signed-off-by: Manan Gupta <manan@planetscale.com>
Force-pushed from 2046330 to d72d214.
Looks fine overall, but maybe a few small changes needed.
Good work 💯
Description
The StopReplicationAndBuildStatusMaps function calls the StopReplicationAndGetStatus RPC and then checks whether the received error is ErrNotReplica. We use this information to determine whether the tablet in question considered itself to be the primary.
However, we should not compare the error pointers directly, because what we receive is an RPC error and not the original error that was sent from the vttablet. This PR changes one of the tests to send the correct error message and fixes the underlying problem.
Also, as part of this PR, the two constant errors have been moved to the vterrors package from the mysql package.
Related Issue(s)
Checklist
Deployment Notes