Add shard indexing pressure integration tests. #1198

getsaurabh02 · 2021-09-01T21:18:46Z

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

Description

This PR is the next in the planned series of PRs for Shard Indexing Pressure (#478). It introduces the shard indexing pressure integration tests (ITs) and refactors the existing indexing pressure ITs.

Issues Resolved

Addresses Item 10 of #478

Check List

- New functionality includes testing.
  - All tests pass
- New functionality has been documented.
  - New functionality has javadoc added
- Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

opensearch-ci-bot · 2021-09-01T21:19:55Z

✅ Gradle Wrapper Validation success 74f1e86f66aba0fa1e241b19ebd690b585f3c8f1

opensearch-ci-bot · 2021-09-01T21:20:57Z

✅ DCO Check Passed 74f1e86f66aba0fa1e241b19ebd690b585f3c8f1

opensearch-ci-bot · 2021-09-01T21:23:31Z

❌ Gradle Precommit failure 74f1e86f66aba0fa1e241b19ebd690b585f3c8f1
Log 1049

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

opensearch-ci-bot · 2021-09-01T21:46:45Z

✅ DCO Check Passed e01e4b7

opensearch-ci-bot · 2021-09-01T21:47:54Z

✅ Gradle Wrapper Validation success e01e4b7

opensearch-ci-bot · 2021-09-01T21:51:32Z

✅ Gradle Precommit success e01e4b7

Bukhtawar · 2021-09-07T17:56:10Z

server/src/test/java/org/opensearch/index/ShardIndexingPressureMemoryManagerTests.java

@@ -181,8 +183,9 @@ public void testCoordinatingPrimarySoftLimitBreachedAndLastSuccessfulRequestLimi
        long limit1 = tracker1.getPrimaryAndCoordinatingLimits();
        long limit2 = tracker2.getPrimaryAndCoordinatingLimits();
        long requestStartTime = System.nanoTime();
+        long delay = TimeUnit.MILLISECONDS.toNanos(100);


Can we randomize delay?

Any value greater than 20 ms should do here. Since this delay does not test any limit and does not depend on the completion of another task or activity, I kept it as a fixed value. Let me know if you still think otherwise.
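The trade-off in this thread can be sketched as follows. This is an illustration only, not code from the PR: the class `DelayChoice`, its method names, and the 200 ms upper bound are assumptions made for the example. Only the 100 ms fixed value and the 20 ms lower bound come from the discussion above.

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class DelayChoice {
    // Lower bound from the review reply: any value greater than 20 ms should do.
    static final long MIN_DELAY_MS = 20;
    // Hypothetical upper bound, chosen only for this sketch.
    static final long MAX_DELAY_MS = 200;

    // Fixed delay as used in the diff (100 ms), expressed in nanoseconds.
    static long fixedDelayNanos() {
        return TimeUnit.MILLISECONDS.toNanos(100);
    }

    // Randomized alternative: a delay in the range (MIN_DELAY_MS, MAX_DELAY_MS] milliseconds.
    static long randomDelayNanos(Random random) {
        int span = (int) (MAX_DELAY_MS - MIN_DELAY_MS);       // 180 possible values
        long millis = MIN_DELAY_MS + 1 + random.nextInt(span); // 21..200 inclusive
        return TimeUnit.MILLISECONDS.toNanos(millis);
    }

    public static void main(String[] args) {
        // A seeded Random keeps a failing run reproducible.
        Random random = new Random(42L);
        System.out.println("fixed delay (ns):  " + fixedDelayNanos());
        System.out.println("random delay (ns): " + randomDelayNanos(random));
    }
}
```

Either choice stays above the 20 ms floor. A randomized value would mainly add coverage if the assertion depended on the exact delay, which the reply says it does not.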

Bukhtawar · 2021-09-07T17:57:32Z

server/src/internalClusterTest/java/org/opensearch/index/ShardIndexingPressureSettingsIT.java

+            successFuture = client(primaryName).bulk(bulkRequest);
+            secondSuccessFuture = client(primaryName).bulk(bulkRequest);
+        }
+        Thread.sleep(25);


These sleeps might cause flaky tests

This introduces a delay between two sequential calls (nothing concurrent), and any delay above the 10 ms defined for SUCCESSFUL_REQUEST_ELAPSED_TIMEOUT above should be sufficient.
Since we are not waiting for any background activity or task to finish, this will not result in flaky tests.

Bukhtawar · 2021-09-07T17:57:48Z

server/src/internalClusterTest/java/org/opensearch/index/ShardIndexingPressureSettingsIT.java

+            successFuture = client(primaryName).bulk(bulkRequest);
+        }
+        // Delay to breach the success time stamp threshold
+        Thread.sleep(3000);


How are we deciding on sleeps?

The default value of SUCCESSFUL_REQUEST_ELAPSED_TIMEOUT is 3000 ms. Here we introduce a sequential delay to mimic a delay in the completion of the request.
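The relation between the sleep durations and SUCCESSFUL_REQUEST_ELAPSED_TIMEOUT in these two threads can be modelled with a small sketch. It is a simplified stand-in, not the actual memory manager logic: `LastSuccessCheck`, `isTimeoutBreached`, and the strict "greater than" comparison are assumptions for illustration. The 10 ms, 25 ms, and 3000 ms values are the ones quoted in the replies.

```java
import java.util.concurrent.TimeUnit;

public class LastSuccessCheck {
    // Default SUCCESSFUL_REQUEST_ELAPSED_TIMEOUT quoted in the thread: 3000 ms.
    static final long DEFAULT_TIMEOUT_NANOS = TimeUnit.MILLISECONDS.toNanos(3000);

    // Assumed rule for this sketch: the threshold is breached once strictly more
    // than timeoutNanos has elapsed since the last successful request.
    static boolean isTimeoutBreached(long lastSuccessNanos, long nowNanos, long timeoutNanos) {
        return nowNanos - lastSuccessNanos > timeoutNanos;
    }

    public static void main(String[] args) {
        long lastSuccess = 0L;
        long timeout10ms = TimeUnit.MILLISECONDS.toNanos(10);

        // 25 ms elapsed against a 10 ms timeout: breached.
        System.out.println(isTimeoutBreached(lastSuccess, TimeUnit.MILLISECONDS.toNanos(25), timeout10ms));
        // 5 ms elapsed against a 10 ms timeout: not breached.
        System.out.println(isTimeoutBreached(lastSuccess, TimeUnit.MILLISECONDS.toNanos(5), timeout10ms));
        // 3001 ms elapsed against the 3000 ms default: breached.
        System.out.println(isTimeoutBreached(lastSuccess, TimeUnit.MILLISECONDS.toNanos(3001), DEFAULT_TIMEOUT_NANOS));
    }
}
```

Passing the timestamps in as arguments keeps the check deterministic. The integration tests instead rely on a real `Thread.sleep` between sequential calls, so the elapsed time also includes the time spent serving the requests.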

Bukhtawar · 2021-09-08T13:53:48Z

server/src/internalClusterTest/java/org/opensearch/index/IndexingPressureIT.java

@@ -74,11 +74,15 @@

    private static final Settings unboundedWriteQueue = Settings.builder().put("thread_pool.write.queue_size", -1).build();

+    public static final Settings settings = Settings.builder()
+        .put(ShardIndexingPressureSettings.SHARD_INDEXING_PRESSURE_ENABLED.getKey(), false).build();


Tests should cover both?

These tests are for node-level indexing pressure, and hence this setting should be off here. The true case is covered as part of ShardIndexingPressureIT.java.

Bukhtawar · 2021-09-08T13:54:10Z

server/src/internalClusterTest/java/org/opensearch/index/ShardIndexingPressureIT.java

+    private static final Settings unboundedWriteQueue = Settings.builder().put("thread_pool.write.queue_size", -1).build();
+
+    public static final Settings settings = Settings.builder()
+        .put(ShardIndexingPressureSettings.SHARD_INDEXING_PRESSURE_ENABLED.getKey(), true).build();


tests should cover both?

These tests are for shard-level indexing pressure, and hence this setting is true here. The false scenarios are covered as part of IndexingPressureIT.java.
Also, there are tests that verify toggling of this setting as part of ShardIndexingPressureSettingsIT.java, such as testShardIndexingPressureFeatureEnabledDisabledSetting.

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

Shard level indexing pressure improves the current Indexing Pressure framework, which performs memory accounting at node level and rejects the requests. This takes a step further to have rejections based on the memory accounting at shard level, along with other key performance factors like throughput and last successful requests.

**Key features**
- Granular tracking of indexing task performance, at every shard level, for each node role, i.e. coordinator, primary and replica.
- Smarter rejections by discarding the requests intended only for a problematic index or shard, while still allowing others to continue (fairness in rejection).
- Rejection thresholds governed by a combination of configurable parameters (such as memory limits on node) and dynamic parameters (such as latency increase, throughput degradation).
- Node level and shard level indexing pressure statistics exposed through the stats API.
- Integration of indexing pressure stats with plugins for metric visibility and auto-tuning in future.
- Control knobs to tune the key performance thresholds which control rejections, to address any specific requirement or issues.
- Control knobs to run the feature in shadow-mode or enforced-mode. In shadow-mode only internal rejection breakdown metrics will be published while no actual rejections will be performed.

The changes were divided into small manageable chunks as part of the following PRs against a feature branch.
- Add Shard Indexing Pressure Settings. #716
- Add Shard Indexing Pressure Tracker. #717
- Refactor IndexingPressure to allow extension. #718
- Add Shard Indexing Pressure Store #838
- Add Shard Indexing Pressure Memory Manager #945
- Add ShardIndexingPressure framework level construct and Stats #1015
- Add Indexing Pressure Service which acts as orchestrator for IP #1084
- Add plumbing logic for IndexingPressureService in Transport Actions. #1113
- Add shard indexing pressure metric/stats via rest end point. #1171
- Add shard indexing pressure integration tests. #1198

Signed-off-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Saurabh Singh <sisurab@amazon.com>
Co-authored-by: Rabi Panda <adnapibar@gmail.com>

getsaurabh02 mentioned this pull request Sep 1, 2021

[Meta] Shard level Indexing Back-Pressure #478

Closed

Add shard indexing pressure integration tests.

e01e4b7

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

getsaurabh02 force-pushed the 10_IT branch from 74f1e86 to e01e4b7 Compare September 1, 2021 21:45

Bukhtawar reviewed Sep 7, 2021

Bukhtawar reviewed Sep 8, 2021

Bukhtawar approved these changes Sep 9, 2021

shwetathareja merged commit 23c5904 into opensearch-project:feature/478_indexBackPressure Sep 9, 2021

adnapibar pushed a commit that referenced this pull request Sep 15, 2021

Add shard indexing pressure integration tests. (#1198)

052bd68

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

adnapibar pushed a commit that referenced this pull request Sep 15, 2021

Add shard indexing pressure integration tests. (#1198)

3b13569

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

adnapibar mentioned this pull request Sep 28, 2021

Merge shard level Indexing back-pressure feature branch #1310

Closed

5 tasks

getsaurabh02 added a commit to getsaurabh02/OpenSearch that referenced this pull request Oct 6, 2021

Add shard indexing pressure integration tests. (opensearch-project#1198)

e9cfb0b

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

getsaurabh02 mentioned this pull request Oct 6, 2021

Shard Indexing Pressure changes to be merged #1336

Merged

5 tasks

getsaurabh02 mentioned this pull request Oct 7, 2021

[Backport] Add Shard Level Indexing Pressure (#478) to 1.x #1343

Merged

5 tasks
