Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DeleteCellInfo(force=true) with downed local topo #9081

Merged
merged 3 commits into from
Oct 30, 2021

Conversation

ajm188
Copy link
Contributor

@ajm188 ajm188 commented Oct 25, 2021

Signed-off-by: Andrew Mason amason@slack-corp.com

Description

Implements the fix proposed in #8220. When GetSrvKeyspaceNames fails (for reasons other than NoNode) and the context expired, we assume it's because the local cell is shutdown, spin up a background context, and delete it anyway.

Manual testing

❯ vtctlclient -server "localhost:15999" AddCellInfo -server_address "localhost:99999" -root "/vitess/zone2" zone2
❯ vtctlclient -server ":15999" GetCellInfoNames
zone1
zone2
❯ vtctlclient -server "localhost:15999" DeleteCellInfo zone2
DeleteCellInfo Error: rpc error: code = Unknown desc = can't list SrvKeyspace entries in the cell; use -force flag to continue anyway (e.g. if cell-local topo was already permanently shut down): failed to create topo connection to localhost:99999, /vitess/zone2: context deadline exceeded
E1025 11:20:39.650251   35350 main.go:76] remote error: rpc error: code = Unknown desc = can't list SrvKeyspace entries in the cell; use -force flag to continue anyway (e.g. if cell-local topo was already permanently shut down): failed to create topo connection to localhost:99999, /vitess/zone2: context deadline exceeded
❯ vtctlclient -server "localhost:15999" DeleteCellInfo -force zone2
❯ vtctlclient -server ":15999" GetCellInfoNames
zone1

Related Issue(s)

Fixes #8220

Checklist

  • Should this PR be backported? no
  • Tests were added or are not required not possible with memorytopo
  • Documentation was added or is not required

Deployment Notes

Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.
Ideally we'd have a unit test for this. Do you think you could make one?

Copy link
Contributor Author

@ajm188 ajm188 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screen Shot 2021-10-29 at 6 15 05 PM

Can't block my own PR, so I'm going to label this do-not-merge and work on that unit test!

Signed-off-by: Andrew Mason <amason@slack-corp.com>
Signed-off-by: Andrew Mason <amason@slack-corp.com>
…e local cell

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@ajm188 ajm188 merged commit 2b40f13 into vitessio:main Oct 30, 2021
@ajm188 ajm188 deleted the deletecell-topodown-fix branch October 30, 2021 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

topo.Server DeleteCellInfo fails to delete if local topo is unreachable, even with force.
2 participants