-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to dynamically resize threadpools size #16236
Conversation
682ae0d
to
71df521
Compare
❌ Gradle check result for 71df521: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
71df521
to
8d75efd
Compare
server/src/internalClusterTest/java/org/opensearch/cluster/settings/ClusterSettingsIT.java
Outdated
Show resolved
Hide resolved
❌ Gradle check result for 8d75efd: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Thanks @gbbafna! I like the idea of dynamically tuning the threadpool size but I am worried there are some sharp edges for users with a cluster setting. Today the threadpool size can be set per-node via the For example, if we increase the snapshots threadpool size across the cluster, that might improve the performance of taking a snapshot on the data nodes, but have adverse effects on cloning a snapshot on the cluster manager nodes. |
I love the idea of tuning, but that also just adds a lot more complexity that barely any of us understand! What should it be? Giving one threat pool more resources should ideally be balanced by a reduction in another, but how do we know? Is there any way to have an overall "max threads across all these different pools" constraint to let people tweak up one pool as long as they constrain others? Are there stats that let us measure how many times a pool is maximized? |
This are quite valid points @jed326, thanks for them . Definitely, there are sharp edges with this feature. We would recommend only expert users to tune this up . Also since its dynamic, users can easily revert it as well on observation of degradation. On a higher level , i would say we should have a different thread pool for cloning and create snapshot . Recently we introduced another threadpool for deletion as well to independently scale up and down. Also i would put in the documentation to not use this if you have a heterogeneous cluster . Practically , I have seen most of the clusters as homogeneous only . |
Thanks @dbwiddis . I agree it is tricky to tune it and would document that only experts should tune it on a need basis and monitor for any degradation.
We can think about this if this will benefit our users .
Yes, we have thread pool stats to figure that out . |
07c95a7
to
4bedb9c
Compare
❌ Gradle check result for 4bedb9c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for b161100: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
… listener on it Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
b161100
to
cc11f7e
Compare
❌ Gradle check result for cc11f7e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, have left some comments.
server/src/internalClusterTest/java/org/opensearch/cluster/settings/ClusterSettingsIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/opensearch/cluster/settings/ClusterSettingsIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
❕ Gradle check result for 2df2683: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16236 +/- ##
============================================
- Coverage 72.06% 72.00% -0.06%
+ Complexity 64822 64786 -36
============================================
Files 5308 5308
Lines 302574 302613 +39
Branches 43710 43723 +13
============================================
- Hits 218048 217897 -151
- Misses 66648 66797 +149
- Partials 17878 17919 +41 ☔ View full report in Codecov by Sentry. |
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16236-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 35c366ddc794e0600184cf406c06ae65061e28ce
# Push it to GitHub
git push --set-upstream origin backport/backport-16236-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x Then, create a pull request where the |
…t#16236) Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
…t#16236) Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
…t#16236) Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
…t#16236) Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
Description
Adds capability to change thread pool sizes for all threadpools defined in core
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.