Fix for v3.5 Ensure that cluster members stored in v2store and backend are in sync #13348
Conversation
Force-pushed from fbd2a0c to 32bfadb
tx.UnsafePut(buckets.MembersRemoved, mkey, []byte("removed"))
if !unsafeMemberExists(tx, mkey) {
Can we document this logic? So, we don't delete the "mkey" from the "Members" bucket, and expect that "mkey" does not exist?
I think the deletion of mkey was moved to line 84. It's just not needed if it's not found.
But +1 to providing docs for this and other methods about the errors being reported.
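To make the discussed control flow concrete, here is a minimal sketch of the guarded removal logic: the removal is always recorded, and the member entry is only deleted if it actually exists, with an error reported otherwise. Names and the map-based "bucket" are hypothetical stand-ins for etcd's bbolt transaction API, not the real implementation.

```go
package main

import "fmt"

// bucket stands in for a bbolt bucket; keys are member IDs.
type bucket map[string][]byte

// mustDeleteMemberFromBackend mirrors the logic under review: always record
// the removal in the "members_removed" bucket, then delete from "members"
// only if the key exists, reporting an error instead of silently succeeding.
func mustDeleteMemberFromBackend(members, removed bucket, mkey string) error {
	removed[mkey] = []byte("removed")
	if _, ok := members[mkey]; !ok {
		return fmt.Errorf("member %q not found in members bucket", mkey)
	}
	delete(members, mkey)
	return nil
}

func main() {
	members := bucket{"8e9e05c52164694d": []byte("{}")}
	removed := bucket{}
	fmt.Println(mustDeleteMemberFromBackend(members, removed, "8e9e05c52164694d")) // <nil>
	fmt.Println(mustDeleteMemberFromBackend(members, removed, "deadbeef"))         // reports an error
}
```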
Found one issue when running integration tests. It looks like the current implementation breaks the TestV3HashRestart test. This test starts etcd, reads the DB hash, restarts etcd, reads the hash again, and compares whether there is a difference. I found that the current implementation, with v2store overriding db members, breaks this assumption.
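The failure mode described above can be illustrated with a toy checksum over member data: any recovery path that rewrites backend members from v2store mutates the on-disk bytes, so a hash taken before and after a restart no longer matches. This is a simplified sketch (a CRC over a sorted map), not the actual hash used by TestV3HashRestart.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// dbHash mimics how a hash-restart test detects unexpected writes: it
// checksums the member data in a deterministic (sorted) order, so any
// mutation of the stored bytes changes the result across restarts.
func dbHash(members map[string]string) uint32 {
	keys := make([]string, 0, len(members))
	for k := range members {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := crc32.NewIEEE()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(members[k]))
	}
	return h.Sum32()
}

func main() {
	members := map[string]string{"m1": `{"peerURLs":["http://a:2380"]}`}
	before := dbHash(members)
	// A recovery step that overwrites backend members from v2store
	// mutates the stored data, breaking the hash-stability assumption.
	members["m1"] = `{"peerURLs":["http://b:2380"]}`
	after := dbHash(members)
	fmt.Println(before != after) // true: the hashes diverge after "restart"
}
```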
Force-pushed from f30e181 to a564fa5
…ckend out of sync
Decided to drop syncing during the Recover call. Updated the PR description to provide the reasoning.
How about syncing StoreV2->Backend just before the storeV2 snapshot operation?
By default a snapshot happens only every 10000 WAL entries. I'm worried it might complicate and obfuscate things even more. I'm a little scared of the StoreV2->backend sync: it's an internal change that is not visible in the WAL and not tied to any consistent index value. This means it has a user-visible impact (it changes the hash of the data) without any user interaction. The currently proposed fix, limited to the AddMember/RemoveMember commands, will not resolve a corrupted state automatically (it will require users to notice the problem and delete/add members themselves). However, it gives users full visibility into what is happening. Would love to get some opinion from @gyuho or @hexfusion
Thank you. Merging, as the discussion is related to a potential next step. I think we should have either an automated 'fix' process or a fix process that requires issuing a single command to bring the stores into sync (independent of the number of out-of-sync members).
Fix for v3.5 Ensure that cluster members stored in v2store and backend are in sync
… on v2 store. Migration off the v2 store was unfinished during v3.5 development (etcd-io#12914), leaving RaftCluster broken and half migrated. This was found in v3.5 and fixed in etcd-io#13348, however this was not patched on the main branch. This PR makes the v3.6 RaftCluster consistent with v3.5 so we can cleanly migrate off the v2 store. Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
As discussed in #13196, there is a chance that between v3.1 and v3.4 the state of bbolt and v2store diverged. This is only noticeable after upgrading to v3.5, as the authoritative storage changed from v2store to bbolt.
This PR is a direct fix for this problem in v3.5. It is not a cherry-pick from master, as v3.6 plans to remove v2store entirely. The current fix is designed to work assuming that users use v3.5.1 before upgrading to v3.6. This approach was picked to avoid delaying the removal of v2store in v3.6 (discussed more on the original issue).
What this PR does:
When etcd bootstraps, RaftCluster Recover will sync the contents of the backend to match v2store. This operation is always correct, and it should be cheap, as the number of cluster members should be around 3-5 and historically should not grow beyond tens. This should fix the problem of zombie members appearing after the upgrade to v3.5.

I resigned from implementing this, as during Recover it's expected that v2store and the backend are out of sync. The reason is that Recover is called during member bootstrap, before applying the WAL, and storev2 and the backend are persisted at different times (storev2 only after a snapshot, the backend every 5 seconds).

cc @ptabor
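The Recover-time sync considered (and ultimately dropped) above can be sketched as a simple reconciliation of two member sets: every member in the authoritative v2store is copied to the backend, and any backend-only "zombie" member is removed. The names and the map-based representation below are hypothetical; etcd's real code operates on a v2store tree and a bbolt members bucket.

```go
package main

import "fmt"

// memberSet stands in for serialized member attributes keyed by member ID.
type memberSet map[string]string

// syncBackendWithV2Store makes the backend member set match the authoritative
// v2store set: missing or stale entries are copied over, and zombie members
// unknown to v2store are deleted. With 3-5 members this is a cheap operation.
func syncBackendWithV2Store(v2store, backend memberSet) {
	for id, attrs := range v2store {
		backend[id] = attrs // add missing members, overwrite stale entries
	}
	for id := range backend {
		if _, ok := v2store[id]; !ok {
			delete(backend, id) // drop zombie members not present in v2store
		}
	}
}

func main() {
	v2 := memberSet{"m1": "a", "m2": "b"}
	be := memberSet{"m1": "stale", "zombie": "x"}
	syncBackendWithV2Store(v2, be)
	fmt.Println(len(be), be["m1"], be["m2"]) // 2 a b
}
```

Note that, as the description explains, running this during Recover is unsafe precisely because v2store and the backend are legitimately out of sync at that point in bootstrap, before the WAL has been applied.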