-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Segment Replication] Draft - Create Segrep orchestration class & Target implementation #3409
Conversation
❌ Gradle Check failure 8c9eaae84079a989504c2f597fe91325368dd00c |
❌ Gradle Check failure 5c5340263f4fb9148cf53a50b4ce41f997f487fb |
❌ Gradle Check failure a006336c4472ce212f63b32ad423f631ee60a9ba |
❌ Gradle Check failure d562fa0274b930338afd27fbb1a6c4c7d6d7e7b5 |
✅ Gradle Check success 18eced14aa7894505f19510f0bd14c9284e502a3 |
server/src/main/java/org/opensearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/CheckpointInfoResponse.java
Outdated
Show resolved
Hide resolved
// How many bytes we've copied since we last called RateLimiter.pause | ||
private final AtomicLong bytesSinceLastPause = new AtomicLong(); | ||
private final RecoverySettings recoverySettings; | ||
private final ReplicationCollection<T> onGoingTransfers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why operate on the collection? Why not operate on a single ReplicationTarget
?
RateLimiter rateLimiter = recoverySettings.rateLimiter(); | ||
if (rateLimiter != null) { | ||
long bytes = bytesSinceLastPause.addAndGet(request.content().length()); | ||
if (bytes > rateLimiter.getMinPauseCheckBytes()) { | ||
// Time to pause | ||
bytesSinceLastPause.addAndGet(-bytes); | ||
long throttleTimeInNanos = rateLimiter.pause(bytes); | ||
indexState.addTargetThrottling(throttleTimeInNanos); | ||
replicationTarget.indexShard().recoveryStats().addThrottleTime(throttleTimeInNanos); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be the only stateful code in this class. How about moving it to a decorator and creating new FileChunkRequestHandler
instances for each new request?
server/src/main/java/org/opensearch/indices/recovery/GetCheckpointInfoRequest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/GetFilesRequest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/GetFilesRequest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/GetFilesResponse.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/SegmentReplicationTransportRequest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/recovery/RetryableTransportClient.java
Outdated
Show resolved
Hide resolved
❌ Gradle Check failure 3177f87d00a76f54f198f016b3f09d475614dacf |
983fcde
to
82209ce
Compare
This change introduces target and source services for Segment Replication. It also refactors PeerRecoveryTargetService and RemoteRecoveryTargetHandler to reuse a FileChunkRequestHandler and transport client that issues retryable requests. Signed-off-by: Marc Handalian <handalm@amazon.com>
Removed any coupling to Recovery settings. Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
…the handler class. Signed-off-by: Marc Handalian <handalm@amazon.com>
❌ Gradle Check failure 983fcde4ee705bd5e5c85cebcd3d4fc5c24d4734 |
Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
✅ Gradle Check success 82209ce47a63a8e9b133784c900d6f2425b5d419 |
❌ Gradle Check failure a09d24d4940fd817b845fc76a732a77215696602 |
…arget. Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>
…eplicationSource. Signed-off-by: Marc Handalian <handalm@amazon.com>
Cutting this up into multiple PRs, starting with #3439 |
Description
This is a draft to show how the refactors introduce in #3234 are intended to be used. This is intended for illustration purposes right now, it needs tests & some package reorganization.
This PR introduces components that orchestrate the flow for Segment Replication events.
SegmentReplicationTargetService
- orchestration component for replication events. Registers a request handler to handle file chunk requests for segrep.SegmentReplicationSource
- Interface that will provide checkpoint data & copy segment files.PeerReplicationSource
- implementation of SegmentReplicationSource where the source is another shard.Along with some stubbed components that will be implemented in upcoming PRs:
SegmentReplicationTarget
- ReplicationTarget implementation for segrep, will handle invoking methods on a provided source.SegmentReplicationSourceService
- Service that handles requests fromPeerReplicationSource
.SegmentReplicationState
State object to track replication events.Additional Refactoring:
RecoveryFileChunkRequest
toFileChunkRequest
and move throttling logic to ReplicationTarget to be reused.Issues Resolved
[List any issues this PR will resolve]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.