[Backport] Add Shard Level Indexing Pressure (#478) to 1.x #1343

Merged: 1 commit merged into opensearch-project:1.x on Oct 11, 2021

getsaurabh02
Member

This is a backport of #478 from main to 1.x.

Shard-level indexing pressure extends the existing Indexing Pressure framework, which performs memory accounting at the node level and rejects requests on that basis. This change goes a step further and bases rejections on memory accounting at the shard level, along with other key performance factors such as throughput and last successful requests.

Key features

  • Granular tracking of indexing task performance at the shard level, for each node role, i.e. coordinator, primary and replica.
  • Smarter rejections that discard only the requests intended for a problematic index or shard, while still allowing others to continue (fairness in rejection).
  • Rejection thresholds governed by a combination of configurable parameters (such as memory limits on the node) and dynamic parameters (such as latency increase and throughput degradation).
  • Node-level and shard-level indexing pressure statistics exposed through the stats API.
  • Integration of indexing pressure stats with plugins for metric visibility and, in future, auto-tuning.
  • Control knobs to tune the key performance thresholds that control rejections, to address any specific requirements or issues.
  • Control knobs to run the feature in shadow mode or enforced mode. In shadow mode, only internal rejection breakdown metrics are published and no actual rejections are performed.
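To make the shard-level accounting and the shadow/enforced distinction concrete, here is a minimal sketch in Python. It is not the code from this PR: the class name `ShardPressureTracker`, its fields, and the fixed byte limits are all hypothetical, and the real feature also factors in dynamic signals such as throughput and last successful requests, which this toy model leaves out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ShardPressureTracker:
    """Toy per-shard memory accounting. All names and limits are hypothetical."""

    node_limit_bytes: int           # assumed static node-wide budget
    shard_limit_bytes: int          # assumed static per-shard budget
    enforced: bool = True           # False models shadow mode
    in_flight: Dict[str, int] = field(default_factory=dict)
    rejections: Dict[str, int] = field(default_factory=dict)

    def try_acquire(self, shard: str, size: int) -> bool:
        """Account for a request of `size` bytes aimed at `shard`."""
        shard_total = self.in_flight.get(shard, 0) + size
        node_total = sum(self.in_flight.values()) + size
        over_limit = (
            shard_total > self.shard_limit_bytes
            or node_total > self.node_limit_bytes
        )
        if over_limit:
            # The rejection is counted in both modes so the metric stays visible.
            self.rejections[shard] = self.rejections.get(shard, 0) + 1
            if self.enforced:
                return False  # enforced mode: actually reject the request
        # Admitted, or shadow mode: record the bytes as in flight.
        self.in_flight[shard] = shard_total
        return True

    def release(self, shard: str, size: int) -> None:
        """Return bytes to the budget once the request completes."""
        self.in_flight[shard] = max(0, self.in_flight.get(shard, 0) - size)


tracker = ShardPressureTracker(node_limit_bytes=1000, shard_limit_bytes=400)
tracker.try_acquire("idx[0]", 300)   # admitted
tracker.try_acquire("idx[0]", 200)   # rejected: the shard would hold 500 > 400
tracker.try_acquire("idx[1]", 200)   # admitted: another shard is unaffected
```

The point of the sketch is the fairness property from the list above: the hot shard is throttled while requests for other shards keep flowing. With `enforced=False` the same calls would all be admitted, and only the rejection counter would change.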

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

The changes were divided into small manageable chunks as part of the following PRs against a feature branch.

- Add Shard Indexing Pressure Settings. opensearch-project#716
- Add Shard Indexing Pressure Tracker. opensearch-project#717
- Refactor IndexingPressure to allow extension. opensearch-project#718
- Add Shard Indexing Pressure Store opensearch-project#838
- Add Shard Indexing Pressure Memory Manager opensearch-project#945
- Add ShardIndexingPressure framework level construct and Stats opensearch-project#1015
- Add Indexing Pressure Service which acts as orchestrator for IP opensearch-project#1084
- Add plumbing logic for IndexingPressureService in Transport Actions. opensearch-project#1113
- Add shard indexing pressure metric/stats via rest end point. opensearch-project#1171
- Add shard indexing pressure integration tests. opensearch-project#1198
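As a rough illustration of the control knobs introduced by the settings PR in the list above, the sketch below builds a cluster-settings request body for shadow mode and for enforced mode. The keys `shard_indexing_pressure.enabled` and `shard_indexing_pressure.enforced` are an assumption based on the public OpenSearch documentation and are not confirmed anywhere in this thread. The helper `shard_indexing_pressure_settings` is hypothetical. Verify the names against the target version before use.

```python
import json
from typing import Any, Dict


def shard_indexing_pressure_settings(enabled: bool, enforced: bool) -> Dict[str, Any]:
    """Build a cluster-settings body (hypothetical helper; setting names are assumed)."""
    return {
        "persistent": {
            # Assumed key: turns shard-level tracking on or off.
            "shard_indexing_pressure.enabled": enabled,
            # Assumed key: False keeps the feature in shadow mode (metrics only).
            "shard_indexing_pressure.enforced": enforced,
        }
    }


shadow_mode = shard_indexing_pressure_settings(enabled=True, enforced=False)
enforced_mode = shard_indexing_pressure_settings(enabled=True, enforced=True)

# The resulting JSON could be sent as the body of a cluster settings update.
print(json.dumps(shadow_mode, indent=2))
```

Running in shadow mode first and switching to enforced mode only after reviewing the rejection metrics would follow the intent of the two modes described in the PR.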

Signed-off-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Rabi Panda <adnapibar@gmail.com>
@opensearch-ci-bot
Collaborator

Can one of the admins verify this patch?

@opensearch-ci-bot
Collaborator

✅   Gradle Wrapper Validation success 83264db

@opensearch-ci-bot
Collaborator

✅   DCO Check Passed 83264db

@opensearch-ci-bot
Collaborator

✅   Gradle Precommit success 83264db

@adnapibar adnapibar added the backport PRs or issues specific to backporting features or enhancements label Oct 8, 2021
@adnapibar
Contributor

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 83264db
Log 646

Reports 646

@getsaurabh02
Member Author

Unable to access Log 646. Reason: AccessDenied

@adnapibar
Contributor

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 83264db
Log 648

Reports 648

@adnapibar
Contributor

The failed tests are not reproducible:

> Task :server:internalClusterTest FAILED
> Task :modules:reindex:oldEs2Fixture#stop
> Task :modules:reindex:oldEs1Fixture#stop
> Task :modules:reindex:oldEs090Fixture#stop

@adnapibar
Contributor

start gradle check

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 83264db
Log 653

Reports 653

@tlfeng tlfeng merged commit 11237da into opensearch-project:1.x Oct 11, 2021
nknize added a commit to nknize/OpenSearch that referenced this pull request Nov 18, 2021
…pensearch-project#1343)"

This reverts commit 11237da.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>