Commit 0cfc9ff

Sync global checkpoint on pending in-sync shards (#43526)
At the end of a peer recovery the primary wants to mark the replica as in-sync. For that the persisted local checkpoint of the replica needs to have caught up with the global checkpoint on the primary. If translog durability is set to ASYNC, this means that information about the persisted local checkpoint can lag on the primary and might need to be explicitly fetched through a global checkpoint sync action.

Unfortunately, that action will only be triggered after 30 seconds, and, even worse, will only run based on what the in-sync shard copies say (see IndexShard.maybeSyncGlobalCheckpoint). As the replica has not been marked as in-sync yet, it is not taken into consideration, and the primary might have its global checkpoint equal to the max seq no, so it thinks nothing needs to be done.

Closes #43486
1 parent 47d8131 commit 0cfc9ff

3 files changed: +6 -4 lines changed

server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java

Lines changed: 1 addition & 1 deletion
@@ -1069,7 +1069,7 @@ private Runnable getMasterUpdateOperationFromCurrentState() {
     }
 
     /**
-     * Whether the are shards blocking global checkpoint advancement. Used by tests.
+     * Whether the are shards blocking global checkpoint advancement.
      */
     public synchronized boolean pendingInSync() {
         assert primaryMode;

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

Lines changed: 4 additions & 2 deletions
@@ -2134,9 +2134,11 @@ public void maybeSyncGlobalCheckpoint(final String reason) {
             final long globalCheckpoint = replicationTracker.getGlobalCheckpoint();
             // async durability means that the local checkpoint might lag (as it is only advanced on fsync)
             // periodically ask for the newest local checkpoint by syncing the global checkpoint, so that ultimately the global
-            // checkpoint can be synced
+            // checkpoint can be synced. Also take into account that a shard might be pending sync, which means that it isn't
+            // in the in-sync set just yet but might be blocked on waiting for its persisted local checkpoint to catch up to
+            // the global checkpoint.
             final boolean syncNeeded =
-                (asyncDurability && stats.getGlobalCheckpoint() < stats.getMaxSeqNo())
+                (asyncDurability && (stats.getGlobalCheckpoint() < stats.getMaxSeqNo() || replicationTracker.pendingInSync()))
                     // check if the persisted global checkpoint
                     || StreamSupport
                             .stream(globalCheckpoints.values().spliterator(), false)
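
The effect of the changed condition can be sketched in isolation. The class below is a hypothetical, simplified stand-in and not the actual IndexShard code: `SyncNeededSketch`, `syncNeededBefore`, and `syncNeededAfter` are illustrative names, the checkpoint values are made up, and the second half of the real condition (the check over the per-copy global checkpoints that follows the `||`) is left out on purpose.

```java
/**
 * Hypothetical, simplified model of the first half of the syncNeeded
 * condition in IndexShard.maybeSyncGlobalCheckpoint. Not the real code.
 */
public final class SyncNeededSketch {

    private SyncNeededSketch() {}

    /** Condition before this commit: only the gap to the max seq no counts. */
    public static boolean syncNeededBefore(boolean asyncDurability, long globalCheckpoint, long maxSeqNo) {
        return asyncDurability && globalCheckpoint < maxSeqNo;
    }

    /** Condition after this commit: a copy that is pending in-sync also forces a sync. */
    public static boolean syncNeededAfter(boolean asyncDurability, long globalCheckpoint, long maxSeqNo,
                                          boolean pendingInSync) {
        return asyncDurability && (globalCheckpoint < maxSeqNo || pendingInSync);
    }

    public static void main(String[] args) {
        // Scenario from the commit message (values are illustrative): the primary's
        // global checkpoint already equals the max seq no, while a recovering replica
        // is still waiting to be marked as in-sync.
        long globalCheckpoint = 41L;
        long maxSeqNo = 41L;

        // Old condition: no sync is triggered, so the replica keeps waiting.
        System.out.println(syncNeededBefore(true, globalCheckpoint, maxSeqNo)); // prints false

        // New condition: the pending in-sync copy triggers the sync.
        System.out.println(syncNeededAfter(true, globalCheckpoint, maxSeqNo, true)); // prints true
    }
}
```

With request durability (`asyncDurability == false`) both variants return false in this sketch, which matches the diff: the new `pendingInSync()` term sits inside the `asyncDurability &&` group.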

test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java

Lines changed: 1 addition & 1 deletion
@@ -1183,7 +1183,7 @@ private void assertNoPendingIndexOperations() throws Exception {
                     }
                 }
             }
-        });
+        }, 60, TimeUnit.SECONDS);
     }
 
     private void assertOpenTranslogReferences() throws Exception {
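
The added `60, TimeUnit.SECONDS` arguments look like an explicit timeout for a retry-until-success assertion; the enclosing call is not visible in this hunk. Presumably the longer wait gives the periodic global checkpoint sync described in the commit message time to run. The following is a minimal sketch of such a helper, under those assumptions. `BusyWaitSketch` and `awaitCondition` are hypothetical names, and this is not the test framework's implementation.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical retry-until-timeout helper; not the Elasticsearch test framework code. */
public final class BusyWaitSketch {

    private BusyWaitSketch() {}

    /**
     * Runs the assertion repeatedly until it passes or the timeout elapses.
     * If the deadline passes, the last AssertionError is rethrown.
     */
    public static void awaitCondition(Runnable assertion, long maxWait, TimeUnit unit) {
        final long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // the assertion passed
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            // Back off exponentially, capped so the condition is still polled regularly.
            sleepMillis = Math.min(sleepMillis * 2, 500);
        }
    }
}
```

A caller would pass the check as a lambda, for example `awaitCondition(() -> { if (!done()) throw new AssertionError("still pending"); }, 60, TimeUnit.SECONDS)`, where `done()` is whatever condition the test is waiting for.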
