[BUG] Cluster scale up - one pod not joining the cluster #419

Closed
aaronfern opened this issue Aug 29, 2022 · 1 comment
Labels
kind/bug Bug status/closed Issue is closed (either delivered or triaged)

Comments

@aaronfern
Contributor

Describe the bug:
When an etcd cluster is scaled up, there is a rare case where the third pod cannot join the cluster. As a result, only 2 members work as expected, leaving a cluster of size 2 with a learner that cannot join.
Although quorum is maintained, we lose resilience, as we are now just one pod away from quorum loss.
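
To make the resilience point concrete, here is a minimal sketch of the majority arithmetic, assuming the standard Raft rule that quorum is a strict majority of voting members and that a learner does not vote. The function names `quorum` and `failure_tolerance` are hypothetical helpers for illustration only, not part of etcd or etcd-druid.

```python
def quorum(voting_members: int) -> int:
    """Smallest strict majority of the voting members (learners are not counted)."""
    if voting_members < 1:
        raise ValueError("a cluster needs at least one voting member")
    return voting_members // 2 + 1


def failure_tolerance(voting_members: int) -> int:
    """Number of voting members that can fail before quorum is lost."""
    return voting_members - quorum(voting_members)


if __name__ == "__main__":
    # Compare the intended cluster with the state described in this issue.
    cases = [("3 voting members", 3), ("2 voting members + stuck learner", 2)]
    for label, voters in cases:
        print(f"{label}: quorum={quorum(voters)}, "
              f"tolerates {failure_tolerance(voters)} failure(s)")
```

Under these assumptions, the intended 3-member cluster tolerates one failure, while the 2-member cluster with a stuck learner tolerates none, which is the loss of resilience described above.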

The third pod reports a cluster ID mismatch error and therefore refuses to join the cluster. A possible cause is incorrect configuration being passed to the etcd container before it is ready to join the cluster. The root cause, however, is not yet clear.

Expected behavior:
Cluster scale-up should always succeed.

How To Reproduce (as minimally and precisely as possible):
It is not yet clear how to reproduce this consistently. It occurs only sporadically.

@abdasgupta
Contributor

Closing this issue as the problem is resolved

@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Dec 8, 2022