Can a bad state NATS cluster be bootstrapped with new config? #1277

Open
mnjdhl opened this issue Oct 29, 2022 · 5 comments

Comments

@mnjdhl

mnjdhl commented Oct 29, 2022

My 3-node NATS cluster is in a bad state. It was previously a 6-node cluster and was recently trimmed down to 3 nodes. The meta.json still lists all 6 members. Leader election fails because it needs 4 votes (a majority of 6) but gets only 3, since NATS is no longer running on the other 3 stale nodes. All 3 active members are in either the Follower or the Candidate state, so with no leader, attempts to remove the 3 stale nodes do not work.

Is there any way to bootstrap this new cluster with the newer configuration?

@mnjdhl mnjdhl changed the title Can a bad state NATS cluster can be bootstrapped with new config? Can a bad state NATS cluster be bootstrapped with new config? Oct 29, 2022
@kozlovic
Member

In order to help you better, would you please clarify if you are talking about NATS Streaming or JetStream? There is no meta.json in NATS Streaming...

But I believe that the solution would be quite similar: you need to start a 4th node to allow an election (and then removal of the old nodes).

@mnjdhl
Author

mnjdhl commented Oct 31, 2022

I am talking about "NATS Streaming", and I did find a meta.json file in it.

@kozlovic
Member

Again, not sure what that file is. Could you display the content of it? I do not believe that NATS Streaming is creating this file.

@mnjdhl
Author

mnjdhl commented Oct 31, 2022

The following is the content of meta.json (located at log/442e19d3-faca-41e5-a32c-803a72d0e4ab/snapshots/12926-115914-1666846843907/):

{
  "Version": 1,
  "ID": "12926-115914-1666846843907",
  "Index": 115914,
  "Term": 12926,
  "Peers": "ltoAbjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYi40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFi2gBuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFiLmRlOWJjZTk3LWE4MGItNDBhNy04NWRmLTk1ZmI4YWYxNDAzYy40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWLaAG40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuZWE2NzY2ZmYtMTY3NC00MTI3LTkxZGQtOTk0YmFlNjgwNWFiLjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYtoAbjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYi5jZjEwNjgzZi1mOGY0LTRiODctOTZiYi00NjM2ZDk5MTRjNWMuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFi2gBuNDQyZTE5ZDMtZmFjYS00MWU1LWEzMmMtODAzYTcyZDBlNGFiLjRkN2FjMjdhLTRhZmQtNDM2My1iM2VhLWJlM2I2NThlZjRkZS40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWLaAG40NDJlMTlkMy1mYWNhLTQxZTUtYTMyYy04MDNhNzJkMGU0YWIuNDhiODdjNGEtYTA4Yy00YTkyLTk2YjQtMWU3MzZmNzIzNjk3LjQ0MmUxOWQzLWZhY2EtNDFlNS1hMzJjLTgwM2E3MmQwZTRhYg==",
  "Configuration": {
    "Servers": [
      {
        "Suffrage": 0,
        "ID": "442e19d3-faca-41e5-a32c-803a72d0e4ab",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.442e19d3-faca-41e5-a32c-803a72d0e4ab.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "de9bce97-a80b-40a7-85df-95fb8af1403c",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.de9bce97-a80b-40a7-85df-95fb8af1403c.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "ea6766ff-1674-4127-91dd-994bae6805ab",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.ea6766ff-1674-4127-91dd-994bae6805ab.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "cf10683f-f8f4-4b87-96bb-4636d9914c5c",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.cf10683f-f8f4-4b87-96bb-4636d9914c5c.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "4d7ac27a-4afd-4363-b3ea-be3b658ef4de",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.4d7ac27a-4afd-4363-b3ea-be3b658ef4de.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      },
      {
        "Suffrage": 0,
        "ID": "48b87c4a-a08c-4a92-96b4-1e736f723697",
        "Address": "442e19d3-faca-41e5-a32c-803a72d0e4ab.48b87c4a-a08c-4a92-96b4-1e736f723697.442e19d3-faca-41e5-a32c-803a72d0e4ab"
      }
    ]
  },
  "ConfigurationIndex": 3191,
  "Size": 66540,
  "CRC": "DwoakjxGDMM="
}
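
The field names above look like the snapshot metadata written by the hashicorp/raft library, where a Suffrage of 0 denotes a voting member. That mapping is an assumption here, not something confirmed in this thread. Under that assumption, a minimal Python sketch for counting the voters in such a file, and the votes a candidate would need, could look like this. SAMPLE_META is a shortened, hypothetical stand-in with placeholder IDs, not the real file.

```python
import json
from typing import List

# Hypothetical, shortened stand-in for the meta.json above:
# six servers, all with Suffrage 0. IDs and addresses are placeholders.
SAMPLE_META = json.dumps({
    "Configuration": {
        "Servers": [
            {"Suffrage": 0, "ID": f"node-{i}", "Address": f"addr-{i}"}
            for i in range(6)
        ]
    }
})

# Assumption: Suffrage 0 means "voter", as in hashicorp/raft.
VOTER = 0


def voter_ids(meta_text: str) -> List[str]:
    """Return the IDs of all servers listed as voters in the metadata."""
    meta = json.loads(meta_text)
    servers = meta["Configuration"]["Servers"]
    return [s["ID"] for s in servers if s["Suffrage"] == VOTER]


def votes_needed(meta_text: str) -> int:
    """Majority of the configured voters: floor(n / 2) + 1."""
    return len(voter_ids(meta_text)) // 2 + 1


if __name__ == "__main__":
    ids = voter_ids(SAMPLE_META)
    print(len(ids), "voters; votes needed:", votes_needed(SAMPLE_META))
```

With six configured voters the majority is four, which matches the "expects 4 votes but gets only 3" symptom described in the first comment.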

@kozlovic
Member

I see. Well, my previous answer stands. Try to start an extra node so that quorum can be reached and a leader can be elected.
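
To illustrate why one extra node should be enough, here is a rough Python sketch of the vote arithmetic during recovery. It assumes the usual Raft majority rule and that the extra node counts as one of the six configured voters (for instance by reusing a stale node's identity), which the thread does not spell out. The function names are hypothetical and this is not NATS Streaming code.

```python
from typing import List, Tuple


def quorum(voters: int) -> int:
    """Majority of the configured voters: floor(n / 2) + 1."""
    return voters // 2 + 1


def recovery_steps(configured: int, live: int, stale: int) -> List[Tuple[int, int, bool]]:
    """Walk through removing stale voters one at a time.

    Each entry is (configured_voters, votes_needed, leader_possible),
    starting with the state before any removal.
    """
    steps = []
    for removed in range(stale + 1):
        voters = configured - removed
        needed = quorum(voters)
        steps.append((voters, needed, live >= needed))
    return steps


if __name__ == "__main__":
    # Current situation: 6 configured, 3 live, so no leader is possible.
    print(recovery_steps(configured=6, live=3, stale=3)[0])
    # With a 4th live node, the 2 remaining stale voters can be removed in turn.
    for step in recovery_steps(configured=6, live=4, stale=2):
        print(step)
```

Under these assumptions, three live nodes out of six configured can never reach the four votes needed, while four live nodes can elect a leader and keep a majority at every step as the stale members are removed.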
