Adjusting the number of replicas can remove valid shard copies from the in-sync set #21719
Comments
I am not sure if this is the right place to post this. I did the following operations (I don't remember the order of these) "Added new node" I also had changed node awareness in between (disabled by setting awareness attribute to I ended up with unassigned shards even after reverting all the cluster updates I had made before. I waited for almost an hour, but some shards didn't get allocated. I manually pinged I had to restart one of the nodes where the data for those unassigned shards is on disk. Even then, some indices are still in an unassigned state. Routing information for the shard that didn't get allocated even after restarting:
curl -XGET 'http://localhost:9200/_shard_stores?pretty'
Can you also provide the output of
As a workaround, if you're absolutely sure that only this one node has data for this shard (i.e. all nodes were available while running the
I am doing some modifications to the cluster, so this output may contain some irrelevant info. The main index was fixed by restarting the node; I am not sure why this shard didn't get allocated even though I see the corresponding files for this shard on disk.
After doing this manually, it is now allocated.
This commit makes two changes to how the in-sync allocations set is updated:
- The set is only trimmed when it grows. This prevents trimming too eagerly when the number of replicas was decreased while shards were unassigned.
- The allocation id of an active primary that failed is only removed from the in-sync set if another replica gets promoted to primary. This prevents the situation where the only available shard copy in the cluster gets removed from the in-sync set.
Closes #21719
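The two rules in the commit message can be modelled with a small sketch. This is not the Elasticsearch implementation: the function name `update_in_sync`, its parameters, and the tie-breaking order used when trimming are all assumptions made for illustration.

```python
def update_in_sync(old_ids, number_of_replicas, started_ids=(),
                   failed_primary_id=None, replica_promoted=False):
    """Hypothetical model of the two rules described in the commit message."""
    old_ids = set(old_ids)
    started = set(started_ids)
    new_ids = old_ids | started

    # Rule 2: drop a failed primary's id only if a replica took over as primary.
    if failed_primary_id is not None and replica_promoted:
        new_ids.discard(failed_primary_id)

    # Rule 1: trim only when the set has grown beyond its previous size.
    max_active = 1 + number_of_replicas
    if len(new_ids) > len(old_ids) and len(new_ids) > max_active:
        # Assumption: keep newly started copies first, then older ids.
        preferred = sorted(started) + sorted(new_ids - started)
        new_ids = set(preferred[:max_active])
    return new_ids


# Primary fails with no replica to promote: nothing is removed or trimmed,
# even though number_of_replicas is 0 and the set holds two ids.
print(update_in_sync({"p", "r"}, 0, failed_primary_id="p"))
```

Under this model, the only remaining shard copy keeps its entry in the in-sync set, so it can be allocated again once its node returns.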
Assume a 3-node cluster (1 master, 2 data nodes) with an index with 1 primary and 1 replica. Decommission the node with the replica shard by shutting it down and wiping its state. The master moves the replica shard to unassigned, but keeps its allocation id in the in-sync set as long as no replication operations happen on the primary. Now, decrease the number of replicas to zero. This does not remove the allocation id of the replica from the in-sync set. Shut down the node with the primary. This will move the primary shard to unassigned AND update the in-sync set:
Logic in IndexMetaDataUpdater comes into play that limits the number of in-sync replica entries to the maximum number of shards that can be active (namely 1, as we have decreased the number of replicas to 0). The set of in-sync allocation ids contains the allocation ids of both the primary and the replica. The algorithm in IndexMetaDataUpdater has no way to choose which id to eliminate from the in-sync replica set and eliminates one at random. Assume the primary is eliminated. After restarting the node with the primary, it cannot be automatically allocated as its shard copy does not match the entry in the in-sync replica set.
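The failure sequence above can be reproduced with a toy model. The sketch below is not the IndexMetaDataUpdater code; `trim_in_sync_old`, `can_auto_allocate`, and the allocation id strings are hypothetical names that only mirror the description in this issue.

```python
import random


def trim_in_sync_old(in_sync_ids, number_of_replicas, pick=random.choice):
    """Hypothetical model of the pre-fix trimming described in this issue.

    The set is capped at the maximum number of copies that can be active.
    With no active shard left, no id is preferred, so `pick` chooses a victim.
    """
    max_active = 1 + number_of_replicas
    ids = sorted(in_sync_ids)
    while len(ids) > max_active:
        ids.remove(pick(ids))
    return set(ids)


def can_auto_allocate(on_disk_allocation_id, in_sync_ids):
    """A shard copy is only allocated automatically if its id is in-sync."""
    return on_disk_allocation_id in in_sync_ids


# The replica's node was wiped, replicas were set to 0, then the primary's node stopped.
in_sync = {"primary-aid", "replica-aid"}
# Force the unlucky case from the issue: the primary's id is the one eliminated.
survivors = trim_in_sync_old(in_sync, 0, pick=lambda ids: "primary-aid")
print(survivors)                                    # only the wiped replica's id is left
print(can_auto_allocate("primary-aid", survivors))  # the restarted primary is rejected
```

With the default `pick`, the outcome depends on chance, which matches the report that some shards recovered after a node restart while others stayed unassigned.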