Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] [Segment Replication] Update store metadata recovery diff logic to ignore missing files causing exception #4185

Merged
merged 3 commits into from
Aug 12, 2022

Conversation

dreamer-89
Copy link
Member

@dreamer-89 dreamer-89 commented Aug 10, 2022

Description

This PR changes the logic to ignore missing files which currently causes exception on replica. This issue happens when primary performs delete/update operations which generates extra files for older segment group. This PR also adds a unit test which exercises this code by generating actual segment group files.

Existing logic

for (List<StoreFileMetadata> segmentFiles : getGroupedFilesIterable()) {
 boolean consistent = true;
 for (StoreFileMetadata meta : segmentFiles) {
   StoreFileMetadata storeFileMetadata = recoveryTargetSnapshot.get(meta.name());
    if (storeFileMetadata == null) {
        consistent = false;  --> missing files (e.g. `*.liv`) makes existing segment group inconsistent
         missing.add(meta);
     } else if (storeFileMetadata.isSame(meta) == false) {
         consistent = false;
         different.add(meta);
      } else {
         identicalFiles.add(meta);
       }
     }
     if (consistent) {
          identical.addAll(identicalFiles); 
     } else {
        // make sure all files are added - this can happen if only the deletes are different
        different.addAll(identicalFiles); --> missing files causes identical files added to different list causing failures
     }

Related

#4117

Issues Resolved

#3787

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Poojita-Raj and others added 2 commits August 10, 2022 11:23
Signed-off-by: Poojita Raj <poojiraj@amazon.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
…n replica during copy

Signed-off-by: Suraj Singh <surajrider@gmail.com>
@dreamer-89 dreamer-89 requested review from a team and reta as code owners August 10, 2022 22:03
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@Poojita-Raj
Copy link
Contributor

 if (consistent) {
      identical.addAll(identicalFiles); 
 } else {
    // make sure all files are added - this can happen if only the deletes are different
    different.addAll(identicalFiles); --> missing files added to different list causing failures
 }

This should be -> identical files added to different list causing failure on check that files are unchanged i.e., different is empty.

* <li>missing: files that exist in the source but not in the target</li>
* </ul>
*/
public RecoveryDiff segmentReplicationDiff(MetadataSnapshot recoveryTargetSnapshot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - rename snapshot as well for consistency - segmentReplicationTargetSnapshot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Poojita-Raj for the review.

This method evaluates the difference in segment files b/w different store metadata snapshots which makes this naming consistent. This name also aligns with existing recoveryDiff used for evaluating snapshots diff.

@@ -247,6 +249,56 @@ public void testStartReplicaAfterPrimaryIndexesDocs() throws Exception {
assertSegmentStats(REPLICA_COUNT);
}

public void testDeleteOperations() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this integ test and it still gives an off by 1 error intermittently so I think we should either hold off on it for a separate PR or fix it before merging it in.

image

Copy link
Member Author

@dreamer-89 dreamer-89 Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good call-out @Poojita-Raj . Yes, I am also able to repro this by running test in interations (fails 3/10). I think this is related to replica lagging behind primary and needs some deep dive. This PR is about avoid shard failures due to delete operations. I will file a new bug to track this.

./gradlew ':server:internalClusterTest' --tests "org.opensearch.indices.replication.SegmentReplicationIT.testDeleteOperations" -Dtests.seed=44620B458370EA29 -Dtests.iters=10

* <li>missing: files that exist in the source but not in the target</li>
* </ul>
*/
public RecoveryDiff segmentReplicationDiff(MetadataSnapshot recoveryTargetSnapshot) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in this method seems identical to recoveryDiff except for how missing files are handled. Instead of duplicating most of the code, I would suggest gating the two pieces of logic with a boolean method parameter like:

public RecoveryDiff recoveryDiff(MetadataSnapshot recoveryTargetSnapshot, boolean shouldCheckMissingFiles) {

That would also remove the need to extract the getGroupedFilesIterable method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kartg, this is a good point! Initially, I thought of adding a flag for gating but later realized recoveryDiff is used heavily and needs lot many changes. Also, I feel keeping them separate clearly calls out for separate functionality and leaves scope for future enhancements around store's metadata comparison (for e.g. identical/missing files are not used for segment comparison and can be removed altogether) and makes it more readable.

@@ -98,6 +122,13 @@ public void setUp() throws Exception {
);
}

private Settings getIndexSettings() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick - any reason why this was extracted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is not required. Undoing this change.

*/
List<Store.MetadataSnapshot> generateStoreMetadataSnapshot(int docCount) throws IOException {
List<Document> docList = new ArrayList<>();
for (int i = 0; i < 10; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - should the limit of this loop be docCount rather than 10 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, updating it.

*/
public void test_MissingFiles_NotCausingFailure() throws IOException {
int docCount = 1 + random().nextInt(10);
List<Store.MetadataSnapshot> storeMetadataSnapshots = generateStoreMetadataSnapshot(docCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick - please add comments here so that it's clear what the subsequent get(0) and get(1) fetch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Suraj Singh <surajrider@gmail.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

Codecov Report

Merging #4185 (9350ba6) into main (a469a3c) will increase coverage by 0.08%.
The diff coverage is 82.70%.

@@             Coverage Diff              @@
##               main    #4185      +/-   ##
============================================
+ Coverage     70.59%   70.68%   +0.08%     
- Complexity    57083    57143      +60     
============================================
  Files          4603     4604       +1     
  Lines        274551   274596      +45     
  Branches      40210    40216       +6     
============================================
+ Hits         193831   194098     +267     
+ Misses        64514    64284     -230     
- Partials      16206    16214       +8     
Impacted Files Coverage Δ
...ava/org/opensearch/client/RestHighLevelClient.java 44.32% <ø> (-0.16%) ⬇️
...gregations/metrics/GeoBoundsAggregatorFactory.java 88.88% <ø> (ø)
...search/aggregations/metrics/InternalGeoBounds.java 62.96% <ø> (ø)
...o/search/aggregations/metrics/ParsedGeoBounds.java 88.00% <ø> (ø)
.../main/java/org/opensearch/search/SearchModule.java 96.27% <ø> (-0.03%) ⬇️
...earch/search/aggregations/AggregationBuilders.java 46.15% <ø> (+1.15%) ⬆️
...regations/support/AggregationInspectionHelper.java 53.65% <ø> (+1.27%) ⬆️
...g/opensearch/test/InternalAggregationTestCase.java 98.21% <ø> (-0.46%) ⬇️
...a/org/opensearch/test/OpenSearchIntegTestCase.java 57.48% <ø> (ø)
...regations/metrics/AbstractGeoBoundsAggregator.java 56.52% <56.52%> (ø)
... and 498 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@dreamer-89 dreamer-89 merged commit b9a64cb into opensearch-project:main Aug 12, 2022
dreamer-89 added a commit to dreamer-89/OpenSearch that referenced this pull request Aug 12, 2022
… to ignore missing files causing exception (opensearch-project#4185)

* Update Store for segment replication dif

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Update recoveryDiff logic to ingore missing files causing exception on replica during copy

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Co-authored-by: Poojita Raj <poojiraj@amazon.com>
Rishikesh1159 pushed a commit to Rishikesh1159/OpenSearch that referenced this pull request Aug 17, 2022
… to ignore missing files causing exception (opensearch-project#4185)

* Update Store for segment replication dif

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Update recoveryDiff logic to ingore missing files causing exception on replica during copy

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Co-authored-by: Poojita Raj <poojiraj@amazon.com>
Rishikesh1159 added a commit that referenced this pull request Aug 17, 2022
…aining segment replication changes (#4243)

* [Segment Replication] Checkpoint Replay on Replica Shard (#3658)

* Adding Latest Recevied checkpoint, replay checkpoint logic along with tests

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* [Segment Replication] Wire up segment replication with peer recovery and add ITs. (#3743)

* Add null check when computing max segment version.

With segment replication enabled it is possible Lucene does not set the SegmentInfos
min segment version, leaving the default value as null.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Update peer recovery to set the translogUUID of replicas to the UUID generated on the primary.

This change updates the UUID when the translog is created to the value stored in the passed segment userdata.
This is to ensure during failover scenarios that the replica can be promoted and not have a uuid mismatch with the value stored in user data.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Wire up Segment Replication under the feature flag.

This PR wires up segment replication and adds some initial integration tests.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add test to ensure replicas use primary translog uuid with segrep.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Update SegmentReplicationIT to assert previous commit points are valid and SegmentInfos can be built.
Fix nitpicks in PR feedback.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix test with Assert.fail to include a message.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Disable shard idle with segment replication. (#4118)

This change disables shard idle when segment replication is enabled.
Primary shards will only push out new segments on refresh, we do not want to block this based on search behavior.
Replicas will only refresh on an externally provided SegmentInfos, so we do not want search requests to hang waiting for a refresh.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix isAheadOf logic for ReplicationCheckpoint comparison (#4112)

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Fix waitUntil refresh policy for segrep enabled indices. (#4137)

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Add IndexShard#getLatestReplicationCheckpoint behind segrep enable feature flag (#4163)

* Add IndexShard#getLatestReplicationCheckpoint behind segrep enable feature flag

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comment. Move tests to SegmentReplicationIndexShardTests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Add segrep enbaled index settings in TargetServiceTests, SourceHandlerTests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* [Segment Replication] Fix OngoingSegmentReplications to key by allocation ID instead of DiscoveryNode. (#4182)

* Fix OngoingSegmentReplications to key by allocation ID instead of DiscoveryNode.

This change fixes segrep to work with multiple shards per node by keying ongoing replications on
allocation ID.  This also updates cancel methods to ensure state is properly cleared on shard cancel.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Clean up cancel methods.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>

* [Bug] [Segment Replication] Update store metadata recovery diff logic to ignore missing files causing exception (#4185)

* Update Store for segment replication dif

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>

* Update recoveryDiff logic to ingore missing files causing exception on replica during copy

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Co-authored-by: Poojita Raj <poojiraj@amazon.com>

* [Segment Replication] Adding PrimaryMode check before publishing checkpoint and processing a received checkpoint. (#4157)

* Adding PrimaryMode check before publishing checkpoint.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Applying spotless check

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Moving segrep specific tests to SegmentReplicationIndexShardTests.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Adding logic and tests for rejecting checkpoints if shard is in PrimaryMode.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Applying ./gradlew :server:spotlessApply.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Applying ./gradlew :server:spotlessApply

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Changing log level to warn in shouldProcessCheckpoint() of IndexShard.java class.

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* Removing unnecessary lazy logging in shouldProcessCheckpoint().

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>

* [Segment Replication] Wait for documents to replicate to replica shards (#4236)

* [Segment Replication] Add thread sleep to account for replica lag in delete operations test

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments, assertBusy on doc count rather than sleep

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Segment Replication - Remove unnecessary call to markAllocationIdAsInSync. (#4224)

This PR Removes an unnecessary call to markAllocationIdAsInSync on the primary shard when replication events complete.
Recovery will manage this initial call.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Segment Replication - Add additional unit tests for update & delete ops. (#4237)

* Segment Replication - Add additional unit tests for update & delete operations.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Rishikesh1159 <rishireddy1159@gmail.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Poojita Raj <poojiraj@amazon.com>
Co-authored-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Suraj Singh <surajrider@gmail.com>
Co-authored-by: Poojita Raj <poojiraj@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants