[Segment Replication BUG] Replica shard fails during segment replication during indexing / bulk indexing calls #4129
Comments
@dreamer-89 Yeah, this should not be failing the replica; it would catch up to the new checkpoint after the current replication event completes. I think this is happening because we are mapping DiscoveryNode -> SegmentReplicationSourceHandler here:

    if (nodesToHandlers.putIfAbsent(
        request.getTargetNode(),
        createTargetHandler(request.getTargetNode(), copyState, fileChunkWriter)
    ) != null) {
        throw new OpenSearchException(
            "Shard copy {} on node {} already replicating",
            request.getCheckpoint().getShardId(),
            request.getTargetNode()
        );
    }

This needs to be mapped by allocation ID, not the node.

edit - just realized your repro instructions are running with 1 shard, but this would be a problem with multiple replicas on a single node.

I think there is also buggy logic here with how we are handling replication in the source service. The source service keeps track of which replicas are actively copying by creating a However, to clear that I think the source service should also be able to handle two calls to
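To illustrate the suggestion of keying by allocation ID rather than by node, here is a minimal, self-contained sketch. It is not the OpenSearch implementation: the class `OngoingReplications`, its methods, and the string-valued "handler" are hypothetical stand-ins for the real handler map, used only to show why an allocation-ID key lets two replicas on the same node replicate concurrently while still rejecting a duplicate request for the same shard copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for the source service's handler bookkeeping. */
final class OngoingReplications {
    // Keyed by allocation ID (unique per shard copy) instead of by node,
    // so several replicas hosted on one node do not collide.
    private final Map<String, String> allocationIdToHandler = new ConcurrentHashMap<>();

    /** Registers a handler; returns false if this shard copy is already replicating. */
    boolean start(String allocationId, String nodeId) {
        // putIfAbsent returns null only when no handler was registered for this key.
        return allocationIdToHandler.putIfAbsent(allocationId, "handler@" + nodeId) == null;
    }

    /** Clears the entry once the replication event completes or is cancelled. */
    void finish(String allocationId) {
        allocationIdToHandler.remove(allocationId);
    }

    /** Number of shard copies currently replicating. */
    int active() {
        return allocationIdToHandler.size();
    }
}
```

With this keying, `start("alloc-1", "node-A")` and `start("alloc-2", "node-A")` both succeed, whereas a second `start("alloc-1", "node-A")` is rejected until `finish("alloc-1")` is called. With a node-keyed map, the second replica on `node-A` would have been rejected as well.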
Have opened #4182 to cover moving this to allocationID over node. I have not been able to repro after applying this change but I think we should leave this open to explore more enhancements.
closing this one bc I haven't seen it since, please reopen if needed.
Describe the bug
When indexing (single or bulk), the index turns yellow when segment replication kicks in. The replica shard fails, and then recovery kicks in, which makes the index green again. Seeing the below error in the logs -
To Reproduce
Steps to reproduce the behavior:
curl -X PUT "localhost:9200/test2?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "replication": { "type": "SEGMENT" }
    }
  }
}'