Optimize NodeIndicesStats output behind flag #14454
❌ Gradle check result for 59a0f5d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/indices/IndicesService.java
server/src/main/java/org/opensearch/indices/NodeIndicesStats.java
server/src/internalClusterTest/java/org/opensearch/nodestats/NodeStatsIT.java
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
59a0f5d to f6d3b58
❌ Gradle check result for f6d3b58: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for b741868: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…-ClusterStats-NodeIndicesStats3 Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
b741868 to a267aef
❌ Gradle check result for a267aef: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 517baa5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…-ClusterStats-NodeIndicesStats3 Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
517baa5 to e009a84
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
❌ Gradle check result for e009a84: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 857e7e1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
❌ Gradle check result for 9876c13: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
❌ Gradle check result for 61bf2e0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
❌ Gradle check result for 73de188: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
For org.opensearch.discovery.ClusterDisruptionIT.classMethod: similar to failures in Flaky Test Issue #14308
For org.opensearch.index.ShardIndexingPressureSettingsIT.testShardIndexingPressureLastSuccessfulSettingsUpdate: similar to failures in Flaky Test Issue #14331
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #14454      +/-   ##
============================================
- Coverage     71.87%   71.83%   -0.04%
- Complexity    63318    63415      +97
============================================
  Files          5231     5244      +13
  Lines        296521   296864     +343
  Branches      42832    42868      +36
============================================
+ Hits         213113   213250     +137
- Misses        65948    66097     +149
- Partials      17460    17517      +57
☔ View full report in Codecov by Sentry.
* Optimize NodeIndicesStats output behind flag Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com> (cherry picked from commit e146f13) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Optimize NodeIndicesStats output behind flag Signed-off-by: Pranshu Shukla <pranshushukla06@gmail.com>
Description:
Today, the APIs that fetch node stats together with indices stats return all shard statistics to the coordinator node (the node on which the REST request originated), regardless of the requested level (nodes, indices, or shards). The coordinator node then has to iterate over every shard entry, which adds overhead and can lead to health issues on that node. Furthermore, when the requested level is indices, the coordinator node repeats the index-level statistics calculation for each node response.
This pull request proposes optimizations to the NodeStats API so that each node pre-computes and returns only the information needed to generate the response. Remote nodes (the individual nodes that respond to the coordinator node) now pre-compute shard-level or index-level statistics based on the level parameter in the REST request: nodes, indices, or shards. The response contains shard-level stats only for level = shards.
The optimisation sits behind a transport-level flag, optimizeNodeIndicesStatsOnLevel, which is part of CommonStatsFlag. The flag is enabled in the RestNodeStatsAction and RestNodeAction API calls, since these code paths do not need the full per-shard output.
Pre-computation on remote nodes minimizes data transfer and processing on the coordinator node, leading to significant performance gains, especially in large clusters. Reduced load on the coordinator node contributes to a healthier state by minimizing GC pauses and CPU spikes.
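The level-based pre-computation described above can be illustrated with a small, self-contained sketch. This is not the code from this PR: LevelAwareStatsSketch, ShardStat, NodeResponse, and build are hypothetical names, the boolean parameter merely stands in for optimizeNodeIndicesStatsOnLevel, and a single document count stands in for the full set of statistics. It only shows how a remote node might shape its response depending on the requested level.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of level-aware pre-computation.
// None of these types are the real OpenSearch classes.
final class LevelAwareStatsSketch {

    enum Level { NODES, INDICES, SHARDS }

    // Stand-in for one shard's statistics: just a document count.
    static final class ShardStat {
        final String index;
        final int shardId;
        final long docCount;

        ShardStat(String index, int shardId, long docCount) {
            this.index = index;
            this.shardId = shardId;
            this.docCount = docCount;
        }
    }

    // What a remote node would send back to the coordinator node.
    static final class NodeResponse {
        final long nodeDocCount;                          // always present
        final Map<String, Long> docCountByIndex;          // filled only for INDICES with the flag on
        final Map<String, List<ShardStat>> shardsByIndex; // filled for SHARDS, or whenever the flag is off

        NodeResponse(long nodeDocCount,
                     Map<String, Long> docCountByIndex,
                     Map<String, List<ShardStat>> shardsByIndex) {
            this.nodeDocCount = nodeDocCount;
            this.docCountByIndex = docCountByIndex;
            this.shardsByIndex = shardsByIndex;
        }
    }

    // optimizeOnLevel is an assumed stand-in for the optimizeNodeIndicesStatsOnLevel flag.
    static NodeResponse build(List<ShardStat> shards, Level level, boolean optimizeOnLevel) {
        long total = 0;
        Map<String, Long> byIndex = new LinkedHashMap<>();
        Map<String, List<ShardStat>> perShard = new LinkedHashMap<>();

        for (ShardStat s : shards) {
            total += s.docCount;
            if (!optimizeOnLevel || level == Level.SHARDS) {
                // Flag off (old behaviour) or level = shards: ship every shard entry.
                perShard.computeIfAbsent(s.index, k -> new ArrayList<>()).add(s);
            } else if (level == Level.INDICES) {
                // Aggregate per index on the remote node instead of on the coordinator.
                byIndex.merge(s.index, s.docCount, Long::sum);
            }
            // level = nodes with the flag on: only the node-level total is kept.
        }
        return new NodeResponse(total, byIndex, perShard);
    }

    public static void main(String[] args) {
        List<ShardStat> shards = new ArrayList<>();
        shards.add(new ShardStat("logs", 0, 10));
        shards.add(new ShardStat("logs", 1, 5));
        shards.add(new ShardStat("metrics", 0, 7));

        NodeResponse r = build(shards, Level.INDICES, true);
        System.out.println(r.nodeDocCount + " " + r.docCountByIndex + " " + r.shardsByIndex.size());
    }
}
```

In this sketch, a request at level nodes or indices with the flag on carries no per-shard entries at all, which is the reduction in transfer and coordinator-side work that the description refers to. With the flag off, the response keeps the old shape, so callers that still expect every shard entry are unaffected.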
Testing:
Extensive testing on a cluster with 20k shards has validated the effectiveness of these optimizations.
Related Issues
Resolves #13340
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.