Fix Concurrent Snapshot Ending And Stabilize Snapshot Finalization #38368
Conversation
Pinging @elastic/es-distributed
@@ -680,14 +692,27 @@ public void applyClusterState(ClusterChangedEvent event) {
try {
    if (event.localNodeMaster()) {
        // We don't remove old master when master flips anymore. So, we need to check for change in master
        if (event.nodesRemoved() || event.previousState().nodes().isLocalNodeElectedMaster() == false) {
            processSnapshotsOnRemovedNodes(event);
final SnapshotsInProgress snapshotsInProgress = event.state().custom(SnapshotsInProgress.TYPE);
Simplified the logic here a little to avoid the endless nesting of null checks that makes it really hard to figure out which chain of conditions led to something being executed.
// 1. Completed snapshots
// 2. Snapshots in state INIT that the previous master failed to start
// 3. Snapshots in any other state that have all their shard tasks completed
snapshotsInProgress.entries().stream().filter(
All snapshot ending happens here now.
- This should prevent any future stale snapshots, i.e. snapshots that have all their shards completed but never get ended.
- Makes it much easier to reason about master failovers.
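As a rough illustration of the selection logic described in these comments, here is a minimal, self-contained sketch. The types and names below are simplified stand-ins, not the actual SnapshotsService code:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SnapshotEndingSketch {

    enum State { INIT, STARTED, SUCCESS, FAILED;
        boolean completed() { return this == SUCCESS || this == FAILED; }
    }

    enum ShardState { INIT, SUCCESS, FAILED;
        boolean completed() { return this != INIT; }
    }

    record Entry(String snapshot, State state, List<ShardState> shards) {}

    /**
     * Entries that can be ended now:
     *   1. completed snapshots,
     *   2. snapshots in state INIT that this master is not itself initializing
     *      (i.e. a previous master failed to start them),
     *   3. snapshots in any other state whose shard tasks have all completed.
     */
    static List<Entry> entriesToEnd(List<Entry> entries, Set<String> initializingSnapshots) {
        return entries.stream()
            .filter(entry ->
                entry.state().completed()
                    || (entry.state() == State.INIT && initializingSnapshots.contains(entry.snapshot()) == false)
                    || (entry.state() != State.INIT && entry.shards().stream().allMatch(ShardState::completed)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
            new Entry("done", State.SUCCESS, List.of(ShardState.SUCCESS)),
            new Entry("stale-init", State.INIT, List.of()),
            new Entry("running", State.STARTED, List.of(ShardState.INIT, ShardState.SUCCESS)),
            new Entry("all-shards-done", State.STARTED, List.of(ShardState.SUCCESS, ShardState.FAILED)));
        // Everything except "running" is picked up for ending.
        System.out.println(entriesToEnd(entries, Set.of()));
    }
}
```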
*/
private void removeFinishedSnapshotFromClusterState(ClusterChangedEvent event) {
This is now automatically covered by the applyClusterState hook.
}
}
entries.add(updatedSnapshot);
} else if (snapshot.state() == State.INIT && initializingSnapshots.contains(snapshot.snapshot()) == false) {
This should be more stable and easier to reason about. It's weird that we check newMaster on some version of the state and then later run this code based on whether or not we failed over earlier.
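A purely illustrative sketch of the bookkeeping idea behind the initializingSnapshots set seen in the diff above: the current master tracks which snapshots it is itself initializing, so an INIT entry not in that set must be a leftover from a previous master. Names and the exact lifecycle are simplified here, not the real implementation:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class InitializingSnapshotsSketch {

    // Snapshots this master is currently initializing itself.
    private final Set<String> initializingSnapshots = ConcurrentHashMap.newKeySet();

    void beginSnapshot(String snapshot) {
        initializingSnapshots.add(snapshot);
        try {
            // ... write the initial snapshot metadata, submit the cluster state update, etc.
        } finally {
            initializingSnapshots.remove(snapshot);
        }
    }

    boolean isStaleInit(String snapshot) {
        // An INIT entry we are not initializing ourselves was left behind by a previous
        // master and needs to be cleaned up, regardless of when or how the failover happened.
        return initializingSnapshots.contains(snapshot) == false;
    }
}
```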
return false;
private static boolean removedNodesCleanupNeeded(SnapshotsInProgress snapshotsInProgress, List<DiscoveryNode> removedNodes) {
// If at least one shard was running on a removed node - we need to fail it
return removedNodes.isEmpty() == false && snapshotsInProgress.entries().stream().flatMap(snapshot ->
This could be way simplified now too, since we're already cleaning up snapshots in SUCCESS and INIT state at the top level of applyClusterState.
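For context, the simplified shape of that check could look something like this self-contained sketch. The types here are stand-ins for SnapshotsInProgress and DiscoveryNode, not the actual method signature:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class RemovedNodesCleanupSketch {

    record Entry(Map<String, String> shardToNode) {} // shard id -> id of the node it runs on

    // Cleanup is only needed if nodes were removed AND at least one snapshot shard was running on one of them.
    static boolean removedNodesCleanupNeeded(List<Entry> entries, Set<String> removedNodeIds) {
        return removedNodeIds.isEmpty() == false && entries.stream()
            .flatMap(entry -> entry.shardToNode().values().stream())
            .anyMatch(removedNodeIds::contains);
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(new Entry(Map.of("shard-0", "node-a", "shard-1", "node-b")));
        System.out.println(removedNodesCleanupNeeded(entries, Set.of()));          // false: no nodes removed
        System.out.println(removedNodesCleanupNeeded(entries, Set.of("node-c")));  // false: nothing ran there
        System.out.println(removedNodesCleanupNeeded(entries, Set.of("node-b")));  // true: shard-1 must be failed
    }
}
```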
* @param failure failure reason or null if snapshot was successful
*/
private void endSnapshot(final SnapshotsInProgress.Entry entry, final String failure) {
private void endSnapshot(final SnapshotsInProgress.Entry entry) {
Just one private method now; the potential failure message lives in the cluster state.
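Roughly, the resulting shape is the following sketch. The Entry record stands in for SnapshotsInProgress.Entry and is not the real class:

```java
class EndSnapshotSketch {

    // Stand-in for SnapshotsInProgress.Entry: the failure message travels with the entry itself.
    record Entry(String snapshot, String failure) {} // failure == null means success

    void endSnapshot(Entry entry) {
        if (entry.failure() == null) {
            System.out.println("finalizing successful snapshot " + entry.snapshot());
        } else {
            System.out.println("finalizing failed snapshot " + entry.snapshot() + ": " + entry.failure());
        }
    }

    public static void main(String[] args) {
        EndSnapshotSketch sketch = new EndSnapshotSketch();
        sketch.endSnapshot(new Entry("snap-1", null));
        sketch.endSnapshot(new Entry("snap-2", "shards failed on removed node"));
    }
}
```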
SnapshotsStatusResponse status =
    client.admin().cluster().prepareSnapshotStatus("repository").setSnapshots("snap").get();
assertThat(status.getSnapshots().iterator().next().getState(), equalTo(State.ABORTED));
} catch (Exception e) {
This isn't necessary anymore; with this fix we'll never create a broken repository.
@@ -156,9 +154,6 @@ public void clusterChanged(ClusterChangedEvent event) {
logger.info("--> got exception from race in master operation retries");
} else {
    logger.info("--> got exception from hanged master", ex);
assertThat(cause, instanceOf(MasterNotDiscoveredException.class));
The timing here has changed and we're now running into
[2019-02-05T09:27:33,492][INFO ][o.e.d.SnapshotDisruptionIT] [testDisruptionOnSnapshotInitialization] --> got exception from hanged master
java.util.concurrent.ExecutionException: RemoteTransportException[[node_tm0][127.0.0.1:46407][cluster:admin/snapshot/create]]; nested: InvalidSnapshotNameException[[test-repo:test-snap-2] Invalid snapshot name [test-snap-2], snapshot with the same name already exists];
in most cases, caused by the retries on the hanged master. I relaxed the assertion as we did elsewhere for this case.
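A hypothetical sketch of what such a relaxed check could look like (not the actual SnapshotDisruptionIT code), just to show the intent: with the changed timing, the retried create-snapshot request against the hanged master can fail with InvalidSnapshotNameException instead of MasterNotDiscoveredException, and the test tolerates either:

```java
class RelaxedDisruptionAssertionSketch {

    // Accept either of the two legitimate outcomes of the master disruption.
    static void assertExpectedDisruptionFailure(Throwable cause) {
        String name = cause.getClass().getSimpleName();
        boolean expected = name.equals("MasterNotDiscoveredException")
            || name.equals("InvalidSnapshotNameException");
        if (expected == false) {
            throw new AssertionError("unexpected failure during master disruption", cause);
        }
    }
}
```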
Jenkins run elasticsearch-ci/2
@ywelsch I tried to fix this in a shorter manner (i.e. without having to make wire protocol changes to
Take a look when you have a chance (the diff isn't so large with whitespace ignored :)).
Test failure is due to #38412.
@ywelsch thanks!
* Backport of various snapshot stability fixes from `master` to `6.7`
* Includes elastic#38368, elastic#38025 and elastic#37612
- Fix two race conditions that lead to stuck snapshots (elastic/elasticsearch#37686)
- Improve resilience of SnapshotShardService (elastic/elasticsearch#36113)
- Fix concurrent snapshot ending and stabilize snapshot finalization (elastic/elasticsearch#38368)
Summary of the fix:
* Calls to endSnapshot were made concurrently, leading to non-deterministic behavior (beginSnapshot was triggering a repository finalization while one that was triggered by a deleteSnapshot was already in progress).
* All endSnapshot calls now originate from the cluster state being in a "completed" state (apart from one short-circuit on initializing an empty snapshot). This forced putting the failure string into SnapshotsInProgress.Entry.
* endSnapshot … (leaving it to the SnapshotsService to decide which snapshot entries are stale).
* Note: I ran a few thousand iterations of the SnapshotResiliencyTests for these changes and they came back green.
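To illustrate why driving endSnapshot from the cluster state helps with the concurrency described above, here is a small, self-contained sketch under the assumption that state updates are applied by a single applier thread, one at a time, so each completed entry is finalized exactly once no matter whether a create or a delete drove it to completion. All names and types are simplified stand-ins, not the actual implementation:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleOriginEndSnapshotSketch {

    record Entry(String snapshot, boolean completed, String failure) {}

    private final Set<String> ended = new HashSet<>();
    // Stand-in for the cluster state applier: one thread, updates applied in order.
    private final ExecutorService applier = Executors.newSingleThreadExecutor();

    void applyState(List<Entry> entries) {
        applier.execute(() -> {
            for (Entry entry : entries) {
                // Every endSnapshot call starts here, and only once per snapshot,
                // regardless of which code path completed the entry.
                if (entry.completed() && ended.add(entry.snapshot())) {
                    endSnapshot(entry);
                }
            }
        });
    }

    private void endSnapshot(Entry entry) {
        System.out.println("finalizing " + entry.snapshot()
            + (entry.failure() == null ? "" : " (failed: " + entry.failure() + ")"));
    }

    public static void main(String[] args) throws InterruptedException {
        SingleOriginEndSnapshotSketch sketch = new SingleOriginEndSnapshotSketch();
        // Two state updates both see the same completed entry; it is finalized only once.
        sketch.applyState(List.of(new Entry("snap-1", true, null)));
        sketch.applyState(List.of(new Entry("snap-1", true, null)));
        sketch.applier.shutdown();
        sketch.applier.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```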