-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Segment Replication] Add PIT/Scroll compatibility with Segment Replication #6644 #6765
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Failure tracked in #5329
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I threw in a few nitpicks but overall a nice solution and LGTM. Let's also javadoc at least the critical path logic along with the test workflow. I threw in some suggested docs you can change, but probably worth adding a bit more explanation for anyone to be able to track the thought process.
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Show resolved
Hide resolved
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Show resolved
Hide resolved
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Show resolved
Hide resolved
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Show resolved
Hide resolved
...er/src/internalClusterTest/java/org/opensearch/indices/replication/SegmentReplicationIT.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/index/engine/NRTReplicationEngine.java
Outdated
Show resolved
Hide resolved
@nknize @mch2 : Updated PR. Please have a look. Changes:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
The existing implementation results in occasional failure of testRelocateWhileContinuouslyIndexingAndWaitingForRefresh (also cause of last two gradle check failrues). The test times out while waiting for cluster health response post primary shard relocation. The timeout happens during engine reset while reading all IndexCommits from directory. The thread simply hangs without ever returning. It appears that files incRef'ed inside OpenSearchReaderManager fiddles with follow up (reset engine workflow) DirectoryReader.listCommits call. @nknize : Would you know cause of this flaky failure ? This flaky failure is blocking merge of this PR |
I observed that test hangs indefinitely while listing all IndexCommits only when there are non-compressed segment files ( Store content & files in latest commits For failing test
Passing test
For debugging purpose, I tried to move the |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #6765 +/- ##
============================================
- Coverage 70.78% 70.76% -0.02%
+ Complexity 59305 59296 -9
============================================
Files 4813 4824 +11
Lines 283781 284055 +274
Branches 40924 40955 +31
============================================
+ Hits 200864 201007 +143
- Misses 66420 66562 +142
+ Partials 16497 16486 -11
... and 485 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
This change makes updates to make PIT/Scroll queries compatibile with Segment Replication. It does this by refcounting files when a new reader is created, and discarding those files after a reader is closed. Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
…of current store state. Signed-off-by: Marc Handalian <handalm@amazon.com>
…close Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Gradle Check (Jenkins) Run Completed with:
|
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6765-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4b7cb022d548f976911a55c58a37a758620b2ef6
# Push it to GitHub
git push --set-upstream origin backport/backport-6765-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
…cation opensearch-project#6644 (opensearch-project#6765) * Segment Replication - PIT/Scroll compatibility. This change makes updates to make PIT/Scroll queries compatibile with Segment Replication. It does this by refcounting files when a new reader is created, and discarding those files after a reader is closed. Signed-off-by: Marc Handalian <handalm@amazon.com> * Fix broken test. Signed-off-by: Marc Handalian <handalm@amazon.com> * Fix test bug with PIT where snapshotted segments are queried instead of current store state. Signed-off-by: Marc Handalian <handalm@amazon.com> * Address review comments and prevent temp file deletion during reader close Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix precommit failure Signed-off-by: Suraj Singh <surajrider@gmail.com> * Use last committed segment infos reference from replication engine Signed-off-by: Suraj Singh <surajrider@gmail.com> * Clean up and prevent incref on segment info file copied from primary Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix failing test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Marc Handalian <handalm@amazon.com> Signed-off-by: Suraj Singh <surajrider@gmail.com> Co-authored-by: Marc Handalian <handalm@amazon.com>
…cation (#7010) * [Segment Replication] Add PIT/Scroll compatibility with Segment Replication #6644 (#6765) * Segment Replication - PIT/Scroll compatibility. This change makes updates to make PIT/Scroll queries compatibile with Segment Replication. It does this by refcounting files when a new reader is created, and discarding those files after a reader is closed. Signed-off-by: Marc Handalian <handalm@amazon.com> * Fix broken test. Signed-off-by: Marc Handalian <handalm@amazon.com> * Fix test bug with PIT where snapshotted segments are queried instead of current store state. Signed-off-by: Marc Handalian <handalm@amazon.com> * Address review comments and prevent temp file deletion during reader close Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix precommit failure Signed-off-by: Suraj Singh <surajrider@gmail.com> * Use last committed segment infos reference from replication engine Signed-off-by: Suraj Singh <surajrider@gmail.com> * Clean up and prevent incref on segment info file copied from primary Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix failing test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Marc Handalian <handalm@amazon.com> Signed-off-by: Suraj Singh <surajrider@gmail.com> Co-authored-by: Marc Handalian <handalm@amazon.com> * Add param definition causing precommit failure Signed-off-by: Suraj Singh <surajrider@gmail.com> * Remove unnecessary override annotation Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Marc Handalian <handalm@amazon.com> Signed-off-by: Suraj Singh <surajrider@gmail.com> Co-authored-by: Marc Handalian <handalm@amazon.com>
Taken from #6644
Description
This change makes updates to make PIT/Scroll queries compatible with Segment Replication. It does this by refcounting files when a new reader is created, and discarding those files after a reader is closed.
Issues Resolved
#6519
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.