[Segment Replication] Ensure replica's store always contains the previous commit point. #2551

Merged (3 commits) on Mar 25, 2022


@mch2 mch2 commented Mar 22, 2022

Description

This change:

  1. Updates the cleanup and validation steps on replicas after a replication event occurs to prevent deleting files that are still required by the latest commit point.
  2. Computes and sends replicas a list of recently merged away files that are still referenced by the primary's latest commit point. This allows replicas to fetch these files if they are not present locally. This can happen when a replica falls multiple commit points behind, or when it starts after the primary already exists and is in this state.
  3. Updates the initial recovery sequence of replicas to copy from the primary before lighting up as active. This fixes a bug where replicas could not be added after the primary.
  4. Prevents replicas from attempting a force merge if segrep is enabled.
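
To make items 1 and 2 concrete, here is a minimal sketch of the two set computations involved. It is an illustration only, not the code in this PR: the class and method names are hypothetical, files are compared by name alone (a real store would also compare checksums), and the inputs are assumed to be plain collections of file names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not the actual Store.cleanupAndVerify code. */
final class ReplicaStoreSketch {

    /**
     * Files a replica may delete after a replication event: everything on disk
     * that is referenced neither by the incoming checkpoint nor by the commit
     * point currently on disk.
     */
    static Set<String> filesSafeToDelete(
        Collection<String> localFiles,
        Collection<String> incomingCheckpointFiles,
        Collection<String> onDiskCommitFiles
    ) {
        Set<String> deletable = new HashSet<>(localFiles);
        deletable.removeAll(incomingCheckpointFiles);
        deletable.removeAll(onDiskCommitFiles);
        return deletable;
    }

    /**
     * Files a replica still has to fetch: files in the incoming checkpoint plus
     * merged away files still referenced by the primary's latest commit point,
     * minus whatever is already present locally.
     */
    static Set<String> filesToFetch(
        Collection<String> localFiles,
        Collection<String> incomingCheckpointFiles,
        Collection<String> mergedAwayButCommittedFiles
    ) {
        Set<String> needed = new HashSet<>(incomingCheckpointFiles);
        needed.addAll(mergedAwayButCommittedFiles);
        needed.removeAll(localFiles);
        return needed;
    }
}
```

The point of the first method is that a file survives cleanup if either reference set still names it, which is what keeps the previous commit point intact on the replica.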

Issues Resolved

closes #2331

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@mch2 mch2 requested a review from a team as a code owner March 22, 2022 00:24
@mch2 mch2 changed the title from "Ensure replica's store always contains the previous commit point." to "[Segment Replication] Ensure replica's store always contains the previous commit point." Mar 22, 2022
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 5dd980e1c68f1e97613fdeba73491e46b66f6ec0
Log 3649

Reports 3649

@dblock dblock requested review from kartg and Poojita-Raj March 22, 2022 14:09
This change:
1. Updates the cleanup and validation steps after a replication event occurs to prevent deleting files still referenced by the on-disk segments_N file or the in-memory SegmentInfos.
2. Sends a metadata diff of on-disk segments with each copy event. This allows replicas that are multiple commit points behind to catch up.
3. Updates initial recovery in IndexShard to copy segments before lighting up as active. This fixes a bug where replicas could not be added after the primary.

Signed-off-by: Marc Handalian <handalm@amazon.com>
@mch2
Member Author

mch2 commented Mar 22, 2022

force push is a rebase from the feature branch.

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure d51a440
Log 3671

Reports 3671

segmentReplicationReplicaService.startReplication(
    checkpoint,
    this,
    source,
    new SegmentReplicationReplicaService.SegmentReplicationListener() {
        @Override
        public void onReplicationDone(SegmentReplicationState state) {
            markReplicationComplete();
Contributor

Why would we remove this when we continue to use markAsReplicating()?

Member Author

I moved this to when ReplicationTarget's finalize method completes because I didn't like SegmentReplicationListener being responsible for marking it as complete. With that said, I think we can do better by moving it to the onDone/onCancel/onFail methods of SegmentReplicationTarget. I will also move markAsReplicating to ReplicationTarget so these classes don't have to manage that.
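
A rough sketch of the shape I have in mind, with the target owning the flag and every terminal callback clearing it. The class below is a hypothetical stand-in, not the real SegmentReplicationTarget; only the onDone/onCancel/onFail names come from the discussion above, and the rest is assumed for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a replication target that owns its own "replicating" flag. */
final class ReplicationTargetSketch {
    private final AtomicBoolean replicating = new AtomicBoolean(false);

    /** Marks the target as replicating; returns false if a replication is already in flight. */
    boolean start() {
        return replicating.compareAndSet(false, true);
    }

    /** Every terminal callback clears the flag, so listeners never have to. */
    void onDone() {
        markComplete();
    }

    void onCancel(String reason) {
        markComplete();
    }

    void onFail(Exception e) {
        markComplete();
    }

    boolean isReplicating() {
        return replicating.get();
    }

    private void markComplete() {
        replicating.set(false);
    }
}
```

With this shape the listener passed to startReplication only reacts to the outcome and no longer has to remember to reset state on each path.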

Member Author

cleaned this up in latest commit, what do you think?

Member

Wasn't the plan to move markAsReplicating to ReplicationTarget as well? Right now, that seems to be in ReplicaService so the lifecycle management is distributed across two classes, which smells. How do we improve this?

Member Author

Hmm, maybe we can completely remove this state and make a synchronized check of whether ReplicationCollection has an ongoing replication for the shard. I would like to take this as a follow-up change, if that's ok.
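
Something along these lines, as a sketch only. The class below is hypothetical and is not the existing ReplicationCollection API; it assumes shards are keyed by a string id and replications by a numeric id, and it uses an atomic map operation in place of an explicit synchronized block.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry of in-flight replications, keyed by shard id. */
final class OngoingReplicationsSketch {
    private final ConcurrentMap<String, Long> ongoingByShard = new ConcurrentHashMap<>();

    /** Atomically registers a replication; returns false if the shard already has one. */
    boolean tryStart(String shardId, long replicationId) {
        return ongoingByShard.putIfAbsent(shardId, replicationId) == null;
    }

    /** Removes the entry only if it still belongs to the given replication. */
    boolean finish(String shardId, long replicationId) {
        return ongoingByShard.remove(shardId, replicationId);
    }

    boolean hasOngoing(String shardId) {
        return ongoingByShard.containsKey(shardId);
    }
}
```

The check and the registration happen in one step, so there is no separate flag on the shard that can drift out of sync with the collection.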

Member

Could you create a task to track it, so we don't forget?

Signed-off-by: Marc Handalian <handalm@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 7d5ab35
Log 3684

Reports 3684

server/src/main/java/org/opensearch/index/store/Store.java (outdated, resolved)
Comment on lines 239 to 240
assert Transports.assertNotTransportThread(this + "[onFailure]");
logger.error("Failure", e);
Member

Why this change in behavior?

Also, better log message?

Member Author

The existing responseListener's onFailure was getting triggered but doing nothing. This at least logs the error on the primary. I've updated to also send the exception back to the replica.
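
As a sketch of the intended behavior (log with context on the primary, then pass the exception on to the replica), assuming a hypothetical listener interface and a plain Consumer standing in for the transport channel; none of these names are the real handler types, and java.util.logging is used only to keep the example self-contained:

```java
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical illustration of a failure path that logs and then replies to the replica. */
final class FailureForwardingSketch {
    private static final Logger LOGGER = Logger.getLogger(FailureForwardingSketch.class.getName());

    /** Hypothetical callback shape, loosely modelled on a response listener. */
    interface Listener<T> {
        void onResponse(T response);

        void onFailure(Exception e);
    }

    /**
     * Builds a listener that logs failures with shard context and forwards them
     * to the replica-facing channel (modelled here as a plain Consumer).
     */
    static <T> Listener<T> loggingAndForwarding(
        String shardId,
        Consumer<T> onSuccess,
        Consumer<Exception> replyToReplica
    ) {
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                onSuccess.accept(response);
            }

            @Override
            public void onFailure(Exception e) {
                // Log on the primary so the failure is visible there too.
                LOGGER.log(Level.SEVERE, "Failed to process replication request for shard " + shardId, e);
                // Forward the exception instead of swallowing it.
                replyToReplica.accept(e);
            }
        };
    }
}
```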

@kartg
Member

kartg commented Mar 23, 2022

Outside of my review comments and the changes in this PR specifically, I wanted to raise a high-level question: should we make the seg-rep tests more rigorous and validate the file structure on replica nodes to make sure it is consistent with the primary?

Our current asserts are around hit count, which seems largely sufficient given that our test case is checking whether newly indexed documents show up on the replica. But it doesn't validate that the shards are point-in-time consistent.

@mch2
Member Author

mch2 commented Mar 24, 2022

> Outside of my review comments and the changes in this PR specifically, I wanted to raise a high-level question: should we make the seg-rep tests more rigorous and validate the file structure on replica nodes to make sure it is consistent with the primary?
>
> Our current asserts are around hit count, which seems largely sufficient given that our test case is checking whether newly indexed documents show up on the replica. But it doesn't validate that the shards are point-in-time consistent.

Good call. I think we can do a decent comparison with the results of _cat/segments.
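
A rough sketch of what such a comparison could look like in a test. It assumes the _cat/segments response was requested with exactly four columns (shard, prirep, segment, docs.count) and no header row, and that each shard has a single replica; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical test helper comparing primary and replica rows from _cat/segments. */
final class CatSegmentsComparisonSketch {

    /**
     * Groups "segment:docs.count" entries by copy ("shard/prirep"). Each input line
     * is assumed to hold four whitespace-separated columns:
     * shard, prirep, segment, docs.count.
     */
    static Map<String, Set<String>> segmentsByCopy(String catOutput) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String line : catOutput.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] cols = trimmed.split("\\s+");
            if (cols.length != 4) {
                throw new IllegalArgumentException("Unexpected row: " + line);
            }
            result.computeIfAbsent(cols[0] + "/" + cols[1], k -> new TreeSet<>())
                .add(cols[2] + ":" + cols[3]);
        }
        return result;
    }

    /** True when the replica copy of the shard lists exactly the primary's segments and doc counts. */
    static boolean replicaMatchesPrimary(String catOutput, String shard) {
        Map<String, Set<String>> byCopy = segmentsByCopy(catOutput);
        Set<String> primary = byCopy.get(shard + "/p");
        Set<String> replica = byCopy.get(shard + "/r");
        return primary != null && primary.equals(replica);
    }
}
```

This only compares segment names and document counts, so it is weaker than a file-level check, but it would catch a replica that is serving a different set of segments than the primary.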

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 8ce2c1a9c78301f9208b5c6a028044d229ed41b2
Log 3740

Reports 3740

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure c7791601588393a1d25388acc286985a824e3281
Log 3743

Reports 3743

- Updated TrackShardRequestHandler to send error case back to replicas.
- Renamed additionalFiles to pendingDeleteFiles in TransportCheckpointInfoResponse.
- Refactored Store.cleanupAndVerify methods to remove duplication.

Signed-off-by: Marc Handalian <handalm@amazon.com>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure f0410ac5b3b8c13bba3ffbe240241ccc6d3bcf69
Log 3749

Reports 3749

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 169cfc4
Log 3750

Reports 3750

@mch2 mch2 merged commit 9bcee79 into opensearch-project:feature/segment-replication Mar 25, 2022