Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make balanced shards allocator timebound #15239

Merged
merged 23 commits into from
Aug 29, 2024

Conversation

imRishN
Copy link
Member

@imRishN imRishN commented Aug 14, 2024

Description

This PR aims to time bound the reroute duration to finish within a specific timeout so that it allows for URGENT priority tasks that would otherwise be waiting in queue.

For instance time taken by rebalance -

% cat elasticsearch.log | grep "to compute"
[2024-08-21T03:44:42,385][WARN ][o.o.c.s.MasterService    ] [a64d304b34ad798faae32932ab14d605] took [7.8m], which is over [10s], to compute cluster state update for [cluster_reroute(reroute after starting shards)]
[2024-08-21T03:52:45,982][WARN ][o.o.c.s.MasterService    ] [a64d304b34ad798faae32932ab14d605] took [7.7m], which is over [10s], to compute cluster state update for [cluster_reroute(reroute after starting shards)]

Hot threads in master -

curl localhost:9200/_nodes/<node id>/hot_threads?interval=5s
   
   99.9% (4.9s out of 5s) cpu usage by thread 'opensearch[<>][clusterManagerService#updateTask][T#1]'
     9/10 snapshots sharing following 20 elements
       app//org.opensearch.cluster.routing.allocation.allocator.LocalShardsBalancer.balanceByWeights(LocalShardsBalancer.java:440)
       app//org.opensearch.cluster.routing.allocation.allocator.LocalShardsBalancer.balance(LocalShardsBalancer.java:204)
       app//org.opensearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:324)
       app//org.opensearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:576)
       app//org.opensearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:538)
       app//org.opensearch.node.Node$$Lambda$2639/0x0000001800a9ec70.apply(Unknown Source)
       app//org.opensearch.cluster.routing.BatchedRerouteService$1.execute(BatchedRerouteService.java:136)
       app//org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:67)
       app//org.opensearch.cluster.service.MasterService.executeTasks(MasterService.java:882)
       app//org.opensearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:434)
       app//org.opensearch.cluster.service.MasterService.runTasks(MasterService.java:301)
       app//org.opensearch.cluster.service.MasterService$Batcher.run(MasterService.java:212)
       app//org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:209)
       app//org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:247)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:863)
       app//org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:283)
       app//org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:246)
       java.base@17.0.9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.9/java.lang.Thread.run(Thread.java:840)
     unique snapshot
       app//org.opensearch.cluster.routing.allocation.allocator.LocalShardsBalancer.balanceByWeights(LocalShardsBalancer.java:470)
       app//org.opensearch.cluster.routing.allocation.allocator.LocalShardsBalancer.balance(LocalShardsBalancer.java:204)
       app//org.opensearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:324)
       app//org.opensearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:576)
       app//org.opensearch.cluster.routing.allocation.AllocationService.reroute(AllocationService.java:538)
       app//org.opensearch.node.Node$$Lambda$2639/0x0000001800a9ec70.apply(Unknown Source)
       app//org.opensearch.cluster.routing.BatchedRerouteService$1.execute(BatchedRerouteService.java:136)
       app//org.opensearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:67)
       app//org.opensearch.cluster.service.MasterService.executeTasks(MasterService.java:882)
       app//org.opensearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:434)
       app//org.opensearch.cluster.service.MasterService.runTasks(MasterService.java:301)
       app//org.opensearch.cluster.service.MasterService$Batcher.run(MasterService.java:212)
       app//org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:209)
       app//org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:247)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:863)
       app//org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:283)
       app//org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:246)
       java.base@17.0.9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@17.0.9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@17.0.9/java.lang.Thread.run(Thread.java:840)

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

❌ Gradle check result for 0e9151c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

❌ Gradle check result for 7fe10d7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

❌ Gradle check result for 8f558d7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

❌ Gradle check result for 9a101b9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, lets add an integ test for the same with deterministic mechanisms to trigger timeouts

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

❌ Gradle check result for 72a10b4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

✅ Gradle check result for 3ba1615: SUCCESS

Copy link
Contributor

❕ Gradle check result for c18d44c: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need an explicit reroute?

@imRishN
Copy link
Member Author

imRishN commented Aug 29, 2024

Do we need an explicit reroute?

Usually, if in the current round of reroute any amount of work was attempted (like allocating an unassigned shard, or moving a shard or rebalancing a shard), a reroute will eventually be triggered. If no work was done (allocating no unassigned shards, moving no shards, rebalancing no shards) a subsequent reroute would also be most probably wasteful. But there could be edge cases where a subsequent reroute might help. Opened issue to track this - #14945

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Copy link
Contributor

✅ Gradle check result for 75145b7: SUCCESS

@Bukhtawar Bukhtawar merged commit e982a16 into opensearch-project:main Aug 29, 2024
34 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 29, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
(cherry picked from commit e982a16)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Bukhtawar pushed a commit that referenced this pull request Aug 29, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 16, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 17, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 21, 2024
* Make balanced shards allocator time bound to prioritise critical operations waiting in the pending task queue

Signed-off-by: Rishab Nahata <rnnahata@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants