-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Remote Store] Update index settings once all shard copies of an index moves to remote enabled nodes #13253
[Remote Store] Update index settings once all shard copies of an index moves to remote enabled nodes #13253
Conversation
❌ Gradle check result for 498ec5c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
maybeUpdatedState = allocationService.applyStartedShards(currentState, shardRoutingsToBeApplied); | ||
// Run remote store migration based tasks | ||
if (ongoingDocrepToRemoteMigration(currentState.metadata().settings())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can reuse RemoteStoreNodeService#isMigratingToRemoteStore
&& (indexRoutingTable.shardsMatchingPredicateCount(ShardRouting::started) == indexRoutingTable.shardsMatchingPredicateCount( | ||
shardRouting -> discoveryNodes.get(shardRouting.currentNodeId()).isRemoteStoreNode() | ||
)); | ||
return allStartedShardsOnRemote && IndexMetadata.INDEX_REMOTE_STORE_ENABLED_SETTING.get(indexMetadata.getSettings()) == false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can short circuit on IndexMetadata.INDEX_REMOTE_STORE_ENABLED_SETTING.get(indexMetadata.getSettings()) == false
at the start of this function .
&& (indexRoutingTable.shardsMatchingPredicateCount(ShardRouting::started) == indexRoutingTable.shardsMatchingPredicateCount( | ||
shardRouting -> discoveryNodes.get(shardRouting.currentNodeId()).isRemoteStoreNode() | ||
)); | ||
return allStartedShardsOnRemote && IndexMetadata.INDEX_REMOTE_STORE_ENABLED_SETTING.get(indexMetadata.getSettings()) == false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check of (indexRoutingTable.shardsMatchingPredicateCount(ShardRouting::started) == indexRoutingTable.shardsMatchingPredicateCount( shardRouting -> discoveryNodes.get(shardRouting.currentNodeId()).isRemoteStoreNode()
is not accurate , as it is not account for started shards for checking shards on remote store . We should check just for all started shards that they are on remote node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to make sure that all the primary shards are assigned as well . There can be case when 1 shard with 0 replica is unassigned , but that copy is present on a docrep node, which is temporarily down.
* @param discoveryNodes Current set of {@link DiscoveryNodes} in the cluster | ||
* @return {@link Tuple} with segment repository name as first element and translog repository name as second element | ||
*/ | ||
public static Tuple<String, String> getRemoteStoreRepositoryNames(DiscoveryNodes discoveryNodes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function doesn't fit well . Tuple also doesn't look the right datastructure for this . Would prefer two functions for retrieving translog repo and segment repo.
❌ Gradle check result for da6b91a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for c11bf32: ABORTED Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 4e15cf9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
4e15cf9
to
da6b91a
Compare
da6b91a
to
8d9c389
Compare
❌ Gradle check result for da6b91a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 8d9c389: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
Added logic within the
ShardStartedClusterStateTaskExecutor
to apply remote based index settings and add remote path based index metadata once all shard copies of an index moves over to remote store enabled nodes. Currently this logic would only execute when the cluster is inmixed
mode and there are remote enabled nodes present in the cluster.The
execute
logic of this cluster state executor fetches the index names whose shards has been marked asSTARTED
, references the currentRoutingTable
from the incomingClusterState
to figure out if all shards of the index are inSTARTED
state and all those shard copies are in remote store enabled nodes. If so, it mutates the incoming metadata by adding the remote store based settings, which is then persisted and published to the data nodes from the cluster managerRelated Issues
Resolves: #13252
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.