From 712cb71fb56faa16984eff5a754f2053444a83b2 Mon Sep 17 00:00:00 2001
From: Marc Handalian
Date: Tue, 6 Sep 2022 10:06:24 -0700
Subject: [PATCH] Segment Replication - Fix NoSuchFileException errors caused when computing metadata snapshot on primary shards. (#4366)

* Segment Replication - Fix NoSuchFileException errors caused when computing metadata snapshot on primary shards.

This change fixes the errors that occur when computing metadata snapshots on primary shards
from the latest in-memory SegmentInfos. The error occurs when a segments_N file that is referenced by the in-memory infos
is deleted as part of a concurrent commit. The segments themselves are incref'd by IndexWriter.incRefDeleter but the commit file (Segments_N) is not.
This change resolves this by ignoring the segments_N file when computing metadata for CopyState and only sending incref'd segment files to replicas.

Signed-off-by: Marc Handalian

* Fix spotless.

Signed-off-by: Marc Handalian

* Update StoreTests.testCleanupAndPreserveLatestCommitPoint to assert additional segments are deleted.

Signed-off-by: Marc Handalian

* Rename snapshot to metadataMap in CheckpointInfoResponse.

Signed-off-by: Marc Handalian

* Refactor segmentReplicationDiff method to compute off two maps instead of MetadataSnapshots.

Signed-off-by: Marc Handalian

* Fix spotless.

Signed-off-by: Marc Handalian

* Revert catchall in SegmentReplicationSourceService.

Signed-off-by: Marc Handalian

* Revert log lvl change.

Signed-off-by: Marc Handalian

* Fix SegmentReplicationTargetTests

Signed-off-by: Marc Handalian

* Cleanup unused logger.

Signed-off-by: Marc Handalian

Signed-off-by: Marc Handalian
Co-authored-by: Suraj Singh
---
 CHANGELOG.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9672227109587..7cf86eaf4ff37 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -53,6 +53,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 - [Segment Replication] Extend FileChunkWriter to allow cancel on transport client ([#4386](https://github.com/opensearch-project/OpenSearch/pull/4386))
 - [Segment Replication] Fix NoSuchFileExceptions with segment replication when computing primary metadata snapshots ([#4366](https://github.com/opensearch-project/OpenSearch/pull/4366))
 - [Segment Replication] Fix timeout issue by calculating time needed to process getSegmentFiles ([#4434](https://github.com/opensearch-project/OpenSearch/pull/4434))
+- [Segment Replication] Update replicas to commit SegmentInfos instead of relying on segments_N from primary shards.
 
 ### Security