Use CloseableRetryableRefreshListener to drain ongoing after refresh tasks on relocation #8683
…tasks on relocation Signed-off-by: Ashish Singh <ssashish@amazon.com>
Codecov Report
@@             Coverage Diff              @@
##               main    #8683      +/-   ##
============================================
+ Coverage     70.87%   70.95%   +0.07%
- Complexity    57201    57252      +51
============================================
  Files          4771     4772       +1
  Lines        270312   270352      +40
  Branches      39505    39513       +8
============================================
+ Hits         191590   191823     +233
+ Misses        62619    62414     -205
- Partials      16103    16115      +12
server/src/main/java/org/opensearch/index/shard/CloseableRetryableRefreshListener.java
Where are we closing the listeners?
Have added it now.
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-8683-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 2ba1157947c84418234386ad5671719a99f4b889
# Push it to GitHub
git push --set-upstream origin backport/backport-8683-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x
Then, create a pull request where the
…tasks on relocation (opensearch-project#8683) * Use CloesableRetryableRefreshListener to drain ongoing after refresh tasks on relocation --------- Signed-off-by: Ashish Singh <ssashish@amazon.com>
…tasks on relocation (opensearch-project#8683) * Use CloesableRetryableRefreshListener to drain ongoing after refresh tasks on relocation --------- Signed-off-by: Ashish Singh <ssashish@amazon.com> Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
…tasks on relocation (opensearch-project#8683) * Use CloesableRetryableRefreshListener to drain ongoing after refresh tasks on relocation --------- Signed-off-by: Ashish Singh <ssashish@amazon.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
RefreshListeners are asynchronous and are triggered after segments are refreshed. Today, during relocation handoff, remote segment uploads can still be in progress from the older primary after the relocation has happened. This PR introduces `CloseableRetryableRefreshListener`, which is extended by `RemoteStoreRefreshListener` and `CheckpointRefreshListener`. A `CloseableRetryableRefreshListener` can be closed, which guarantees that refreshes do not trigger any after-refresh operations on these listeners once they are closed.

In summary, the PR does the following:
- Introduces `CloseableRetryableRefreshListener`, which can be closed. It achieves this by acquiring all available permits during close, so that there are no further invocations of the `void afterRefresh(boolean didRefresh)` method.
- `CloseableRetryableRefreshListener` invokes `performAfterRefresh(boolean didRefresh, boolean isRetry)` synchronously on the calling thread. The `performAfterRefresh` method returns true if the invocation was successful and false otherwise.
- `CloseableRetryableRefreshListener` can schedule a retry if the original `performAfterRefresh` returns false. It retries the same `performAfterRefresh` after an interval returned by the implementor of the `CloseableRetryableRefreshListener` abstract class.
- `CloseableRetryableRefreshListener` also has internal constructs to ensure that at most one retry is scheduled for a future time at any point. It also uses semaphore permits to ensure that `performAfterRefresh` and `retry` runs do not overlap.
- In `IndexShard`, the `relocation` method has been updated to close the refresh listeners before the handoff, which ensures consistency of the uploaded segments data.
- The `REMOTE_REFRESH` threadpool has been renamed to `REMOTE_REFRESH_RETRY`.
- The `afterRefresh` method invocation in `RemoteStoreRefreshListener` remains synchronous, as before, and also happens on the original `REFRESH` threadpool.

Related Issues
Resolves #8345
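For readers skimming the Description, the close-and-retry behaviour it lists can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: the class name `CloseableRetryableListenerSketch`, the single-permit semaphore, the `retryIntervalMillis()` hook, and the plain `ScheduledExecutorService` standing in for the `REMOTE_REFRESH_RETRY` threadpool are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, simplified stand-in for the listener described above. */
abstract class CloseableRetryableListenerSketch implements AutoCloseable {
    // Assumption: one permit is enough to illustrate the idea.
    private static final int TOTAL_PERMITS = 1;
    private final Semaphore permits = new Semaphore(TOTAL_PERMITS);
    // Guards against more than one retry being scheduled at a time.
    private final AtomicBoolean retryScheduled = new AtomicBoolean(false);
    // Assumption: stand-in for a dedicated retry threadpool.
    private final ScheduledExecutorService retryExecutor;

    CloseableRetryableListenerSketch(ScheduledExecutorService retryExecutor) {
        this.retryExecutor = retryExecutor;
    }

    /** Runs the after-refresh work on the calling thread unless closed or busy. */
    public final void afterRefresh(boolean didRefresh) {
        runOnce(didRefresh, false);
    }

    private void runOnce(boolean didRefresh, boolean isRetry) {
        if (!permits.tryAcquire()) {
            return; // closed, or another run currently holds the permit
        }
        boolean successful;
        try {
            successful = performAfterRefresh(didRefresh, isRetry);
        } finally {
            permits.release();
        }
        if (!successful) {
            scheduleRetry(didRefresh);
        }
    }

    private void scheduleRetry(boolean didRefresh) {
        // Only the caller that flips the flag schedules a retry.
        if (retryScheduled.compareAndSet(false, true)) {
            retryExecutor.schedule(() -> {
                retryScheduled.set(false);
                runOnce(didRefresh, true);
            }, retryIntervalMillis(), TimeUnit.MILLISECONDS);
        }
    }

    /**
     * Waits for any in-flight run to release its permit, then keeps all permits
     * so later afterRefresh calls and pending retries become no-ops.
     * Calling close() twice would block forever in this sketch.
     */
    @Override
    public final void close() {
        permits.acquireUninterruptibly(TOTAL_PERMITS);
    }

    /** Returns true on success; false asks for a retry. */
    protected abstract boolean performAfterRefresh(boolean didRefresh, boolean isRetry);

    /** Delay before the next retry, chosen by the implementor. */
    protected abstract long retryIntervalMillis();
}

final class ListenerSketchDemo {
    public static void main(String[] args) {
        ScheduledExecutorService retryPool = Executors.newSingleThreadScheduledExecutor();
        CloseableRetryableListenerSketch listener = new CloseableRetryableListenerSketch(retryPool) {
            @Override
            protected boolean performAfterRefresh(boolean didRefresh, boolean isRetry) {
                System.out.println("after-refresh work, isRetry=" + isRetry);
                return true; // report success so no retry is scheduled
            }

            @Override
            protected long retryIntervalMillis() {
                return 100;
            }
        };
        listener.afterRefresh(true); // runs the work on the calling thread
        listener.close();            // e.g. just before relocation handoff
        listener.afterRefresh(true); // ignored: close() holds the permit
        retryPool.shutdown();
    }
}
```

Because `close()` blocks until the permit is free, any after-refresh run that is already in progress finishes before the handoff continues, and nothing starts afterwards. A production version would likely bound the wait in `close()` with a timeout rather than block indefinitely.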
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.