Add framework level constructs to track shard indexing pressure. #525

getsaurabh02 · 2021-04-12T18:39:26Z

Description

This PR is 1st of the 4 planned PRs for Shard Indexing Pressure ((#478). It provides the framework level constructs to track shard indexing pressure.

Shard Indexing Pressure introduces smart rejections of indexing requests when there are too many stuck/slow requests in the cluster, breaching key performance thresholds. This prevents the nodes in cluster to run into cascading effects of failures.

Co-authored-by: Dharmesh Singh sdharms@amazon.com

Issues Resolved

Addresses Item 1 of #478

Check List

[+ ] New functionality includes testing.
- [+ ] All tests pass
[+ ] New functionality has been documented.
- [+ ] New functionality has javadoc added
[+] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

odfe-release-bot · 2021-04-12T18:40:44Z

✅ DCO Check Passed 14426c2

odfe-release-bot · 2021-04-12T18:41:51Z

✅ Gradle Wrapper Validation success 14426c2

odfe-release-bot · 2021-04-12T18:45:42Z

✅ Gradle Precommit success 14426c2

Bukhtawar · 2021-04-12T20:27:04Z

server/src/main/java/org/opensearch/index/IndexingPressure.java

-    private final AtomicLong currentCombinedCoordinatingAndPrimaryBytes = new AtomicLong(0);
-    private final AtomicLong currentCoordinatingBytes = new AtomicLong(0);
-    private final AtomicLong currentPrimaryBytes = new AtomicLong(0);
-    private final AtomicLong currentReplicaBytes = new AtomicLong(0);
+    final AtomicLong currentCombinedCoordinatingAndPrimaryBytes = new AtomicLong(0);
+    final AtomicLong currentCoordinatingBytes = new AtomicLong(0);
+    final AtomicLong currentPrimaryBytes = new AtomicLong(0);
+    final AtomicLong currentReplicaBytes = new AtomicLong(0);


I guess the intent is to make it visible to the subclass... If yes make this protected

intent was also the to keep this as package private. Agree there are tradeoffs based on which instance is created ie IndexingPressure vs ShardIndexingPressure

Bukhtawar · 2021-04-12T20:51:54Z

server/src/main/java/org/opensearch/index/ShardIndexingPressureStore.java

+                ShardIndexingPressureTracker newShardIndexingPressureTracker = new ShardIndexingPressureTracker(shardId);
+                newShardIndexingPressureTracker.getPrimaryAndCoordinatingLimits().set(this.shardIndexingPressureSettings
+                    .getShardPrimaryAndCoordinatingBaseLimits());
+                newShardIndexingPressureTracker.getReplicaLimits().set(this.shardIndexingPressureSettings.getShardReplicaBaseLimits());


Lets add base limtis to the constructor itself. So that the new instance gets created with a base limit always.

odfe-release-bot · 2021-04-12T23:50:35Z

✅ DCO Check Passed 57a0032

odfe-release-bot · 2021-04-12T23:50:35Z

✅ Gradle Wrapper Validation success 57a0032

odfe-release-bot · 2021-04-12T23:58:29Z

✅ Gradle Precommit success 57a0032

Bukhtawar · 2021-04-13T05:58:26Z

server/src/main/java/org/opensearch/index/ShardIndexingPressure.java

+
+    public static boolean isShardIndexingPressureAttributeEnabled() {
+        Iterator<DiscoveryNode> nodes = clusterService.state().getNodes().getNodes().valuesIt();
+        while (nodes.hasNext()) {
+            if (Boolean.parseBoolean(nodes.next().getAttributes().get(SHARD_INDEXING_PRESSURE_ENABLED_ATTRIBUTE_KEY)) == false) {
+                return false;
+            }
+        }
+        return true;
+    }


Let's move this to ShardIndexingPressureSettings

makes sense. Moved to ShardIndexingPressureSettings for containment of logic.

Bukhtawar · 2021-04-13T06:18:28Z

server/src/main/java/org/opensearch/index/ShardIndexingPressureMemoryManager.java

+    boolean isReplicaShardLimitBreached(ShardIndexingPressureTracker tracker, long requestStartTime,
+                                        Map<Long, ShardIndexingPressureTracker> shardIndexingPressureStore, long nodeReplicaBytes) {
+
+        //Memory limits is breached when the current utilization is greater than operating_factor.upper of total shard limits.
+        long shardReplicaBytes = tracker.memory().getCurrentReplicaBytes().get();
+        long shardReplicaLimits = tracker.getReplicaLimits().get();
+        final boolean shardMemoryLimitsBreached =
+            ((double)shardReplicaBytes / shardReplicaLimits) > this.upperOperatingFactor;
+
+        if(shardMemoryLimitsBreached) {
+            /*
+            Secondary Parameters(i.e. LastSuccessfulRequestDuration and Throughput) is taken into consideration when
+            the current node utilization is greater than primary_parameter.node.soft_limit of total node limits.
+             */
+            if(((double)nodeReplicaBytes / this.shardIndexingPressureSettings.getNodeReplicaLimits()) < this.nodeSoftLimit)  {
+                boolean isShardLimitsIncreased =
+                    this.increaseShardReplicaLimits(tracker, shardIndexingPressureStore);
+                if(isShardLimitsIncreased == false) {
+                    tracker.rejection().getReplicaNodeLimitsBreachedRejections().incrementAndGet();
+                    totalNodeLimitsBreachedRejections.incrementAndGet();
+                }
+
+                return !isShardLimitsIncreased;
+            } else {
+                boolean shardLastSuccessfulRequestDurationLimitsBreached =
+                    this.evaluateLastSuccessfulRequestDurationLimitsBreached(
+                        tracker.timeStamp().getLastSuccessfulReplicaRequestTimestamp().get(),
+                        requestStartTime, tracker.outstandingRequest().getTotalOutstandingReplicaRequests().get());
+
+                boolean shardThroughputDegradationLimitsBreached =
+                    this.evaluateThroughputDegradationLimitsBreached(
+                        Double.longBitsToDouble(tracker.throughput().getReplicaThroughputMovingAverage().get()),
+                        tracker.memory().getTotalReplicaBytes().get(), tracker.latency().getReplicaTimeInMillis().get(),
+                        tracker.throughput().getReplicaThroughputMovingQueue().size(), replicaThroughputDegradationLimits);
+
+                if (shardLastSuccessfulRequestDurationLimitsBreached || shardThroughputDegradationLimitsBreached) {
+                    if(shardLastSuccessfulRequestDurationLimitsBreached) {
+                        tracker.rejection().getReplicaLastSuccessfulRequestLimitsBreachedRejections().incrementAndGet();
+                        totalLastSuccessfulRequestLimitsBreachedRejections.incrementAndGet();
+                    } else {
+                        tracker.rejection().getReplicaThroughputDegradationLimitsBreachedRejections().incrementAndGet();
+                        totalThroughputDegradationLimitsBreachedRejections.incrementAndGet();
+                    }
+
+                    return true;
+                } else {
+                    boolean isShardLimitsIncreased =
+                        this.increaseShardReplicaLimits(tracker, shardIndexingPressureStore);
+                    if(isShardLimitsIncreased == false) {
+                        tracker.rejection().getReplicaNodeLimitsBreachedRejections().incrementAndGet();
+                        totalNodeLimitsBreachedRejections.incrementAndGet();
+                    }
+
+                    return !isShardLimitsIncreased;
+                }
+            }
+        } else {
+            return false;
+        }
+    }


Lets add this to meta as a fast follow up to break this monolith

Yes, added in #478 as an item list.

Bukhtawar · 2021-04-13T06:29:38Z

server/src/main/java/org/opensearch/index/ShardIndexingPressureStore.java

+    private void updateIndexingPressureColdStore(ShardIndexingPressureTracker tracker) {
+        if (shardIndexingPressureColdStore.size() > maxColdStoreSize) {
+            shardIndexingPressureColdStore.clear();
+        }
+        shardIndexingPressureColdStore.put((long)tracker.getShardId().hashCode(), tracker);
+    }


Clearing the cold store just because it got full doesn't seem ideal. Maybe we can improve the data structure used here

Bukhtawar

Thanks Saurabh/Dharmesh for the changes. LGTM overall, lets revisit this.

odfe-release-bot · 2021-04-13T06:43:35Z

✅ Gradle Wrapper Validation success bce6e90

odfe-release-bot · 2021-04-13T06:43:35Z

✅ DCO Check Passed bce6e90

odfe-release-bot · 2021-04-13T06:50:38Z

✅ Gradle Precommit success bce6e90

nknize · 2021-04-14T19:18:25Z

start gradle check

odfe-release-bot · 2021-04-14T19:32:08Z

❌ Gradle Check failure bce6e90
Log 72

Reports 72

odfe-release-bot · 2021-04-15T12:05:21Z

✅ Gradle Wrapper Validation success 7c4c654ccb6187edf30ca39e2434b68f4552795f

odfe-release-bot · 2021-04-15T12:05:21Z

✅ DCO Check Passed 7c4c654ccb6187edf30ca39e2434b68f4552795f

odfe-release-bot · 2021-04-15T12:12:55Z

✅ Gradle Precommit success 7c4c654ccb6187edf30ca39e2434b68f4552795f

psychbot · 2021-04-15T12:14:07Z

@nknize Fixed some of the integ tests, Can you please trigger start gradle check again.

nknize · 2021-04-15T12:15:58Z

start gradle check

odfe-release-bot · 2021-04-15T12:38:24Z

❌ Gradle Check failure 7c4c654ccb6187edf30ca39e2434b68f4552795f
Log 83

Reports 83

odfe-release-bot · 2021-04-15T13:36:27Z

❌ Gradle Precommit failure 0a91e90452a4110e00c7757ec9f527cfe1059a6f
Log 221

odfe-release-bot · 2021-04-15T13:36:41Z

❌ Gradle Wrapper Validation failure 0a91e90452a4110e00c7757ec9f527cfe1059a6f

:alert: Gradle Wrapper integrity has been altered

odfe-release-bot · 2021-04-15T13:36:51Z

✅ DCO Check Passed 13cf3d26fff187dcc308530cbf9d95085e8ad43c

odfe-release-bot · 2021-04-15T13:36:51Z

✅ Gradle Wrapper Validation success 13cf3d26fff187dcc308530cbf9d95085e8ad43c

odfe-release-bot · 2021-04-19T18:24:16Z

✅ Gradle Precommit success da37500

peterzhuamazon · 2021-04-19T18:25:47Z

start gradle check

odfe-release-bot · 2021-04-19T18:41:38Z

❌ Gradle Check failure da37500
Log 106

Reports 106

Signed-off-by: Dharmesh 💤 <sdharms@amazon.com>

odfe-release-bot · 2021-04-19T19:19:41Z

✅ Gradle Wrapper Validation success c069b09

odfe-release-bot · 2021-04-19T19:19:41Z

✅ DCO Check Passed c069b09

odfe-release-bot · 2021-04-19T19:26:58Z

✅ Gradle Precommit success c069b09

peterzhuamazon · 2021-04-19T19:33:00Z

start gradle check

odfe-release-bot · 2021-04-19T19:57:34Z

❌ Gradle Check failure c069b09
Log 108

Reports 108

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

odfe-release-bot · 2021-04-20T04:28:46Z

✅ DCO Check Passed 1b6bc7d

odfe-release-bot · 2021-04-20T04:28:46Z

✅ Gradle Wrapper Validation success 1b6bc7d

odfe-release-bot · 2021-04-20T04:34:33Z

✅ Gradle Precommit success 1b6bc7d

peterzhuamazon · 2021-04-20T16:54:57Z

start gradle check

odfe-release-bot · 2021-04-20T17:22:29Z

✅ Gradle Check success 1b6bc7d
Log 117

Reports 117

nknize

Just starting to look at the framework changes so I'll dig a little deeper. First thought I'm curious why this is introducing an IndexingPressure specific service and not built into IndicesService which is used at the node level anyway?

getsaurabh02 · 2021-04-22T03:53:09Z

Thanks @nknize. IndexingPressureService here acts as an orchestrator for the underlying low level constructs such as IndexingPressure and ShardIndexingPressure, and abstracts out the core & routing logic based on the dynamic setting values (preferences). This allows isolating the low level constructs with dedicated responsibility, as with IndexingPressure vs ShardIndexingPressure. It also simplifies addition of new similar constructs in future and even helps deprecation of the older constructs faster, such as we envision deprecating IndexingPressure eventually once we gain sufficient confidence on ShardIndexingPressure. Therefore having IndexingPressureService hugely simplifies the plumbing logic for callers, and promotes independent improvement on the core constructs in isolation. Also going forward, IndexingPressureService will have required constructs for exposing listeners/interfaces for plugin development.

nknize · 2021-04-22T04:24:26Z

such as we envision deprecating IndexingPressure eventually once we gain sufficient confidence on ShardIndexingPressure

+1 to this plan!

I understand the benefits to isolating the logic for "future" considerations and am not questioning that choice. I'm more specifically curious (keeping in mind that I haven't dug too deep into the changes yet) why this is a new service at the NodeService layer instead of placing that isolated index monitoring logic in the existing IndicesService (or maybe even down in the IndexService) where you still have access to it through the APIs? Was that choice only because the original IndexingPressure implementation (for adding to NodeStats API) was already in the NodeService?

getsaurabh02 · 2021-04-22T04:50:49Z

IndexingPressure implementation (for adding to NodeStats API) was already in the NodeService

This is somewhat true. And the reason behind is that the tracking/monitoring done by the IndexingPressure is agnostic of the actual indices lifecycle on the node/cluster, rather is just operates on the node level write operations coming in.

Also the implementation of IndicesService focusses pretty much on the internals of the index management, such as index lifecycle, metadata setup, indices cache, field data etc. However the ShardIndexingPressure and IndexingPressure are not coupled with the actual indices lifecycle & states. It rather taps into the incoming write requests at the transport layers on a Node and performs aggregation/accounting. It then maintain states based on in-flight requests, to perform rejections of new request if the key performance thresholds are breached. Being closer to node operations NodeService seemed to be an ideal spot.

getsaurabh02 · 2021-04-27T05:43:00Z

Hi @nknize Please let me know if you have more thoughts.

opensearch-ci-bot · 2021-07-02T22:15:17Z

✅ Gradle Wrapper Validation success 1b6bc7d

opensearch-ci-bot · 2021-07-02T22:16:36Z

✅ DCO Check Passed 1b6bc7d

opensearch-ci-bot · 2021-07-02T22:48:13Z

✅ Gradle Precommit success 1b6bc7d

opensearch-ci-bot · 2021-09-08T17:49:57Z

✅ Gradle Precommit success 1b6bc7d

opensearch-ci-bot · 2021-09-29T21:21:42Z

Can one of the admins verify this patch?

getsaurabh02 mentioned this pull request Apr 12, 2021

Add framework level constructs to track shard indexing pressure. #496

Closed

Bukhtawar reviewed Apr 12, 2021

View reviewed changes

Bukhtawar reviewed Apr 13, 2021

View reviewed changes

Bukhtawar approved these changes Apr 13, 2021

View reviewed changes

getsaurabh02 mentioned this pull request Apr 13, 2021

[Meta] Shard level Indexing Back-Pressure #478

Closed

psychbot force-pushed the shard_indexing_pressure_part_1 branch from 0a91e90 to 13cf3d2 Compare April 15, 2021 13:33

Changed the max & min number of threads to fix tests

c069b09

Signed-off-by: Dharmesh 💤 <sdharms@amazon.com>

Add node attribute check in the Cluster Allocation explain tests.

1b6bc7d

Signed-off-by: Saurabh Singh <sisurab@amazon.com>

nknize reviewed Apr 22, 2021

View reviewed changes

nknize added enhancement Enhancement or improvement to existing feature or request opendistro-port Features ported from OpenDistro stalled Issues that have stalled v2.0.0 Version 2.0.0 labels Jun 24, 2021

getsaurabh02 closed this Oct 8, 2021

Add framework level constructs to track shard indexing pressure. #525

Add framework level constructs to track shard indexing pressure. #525

Conversation

getsaurabh02 commented Apr 12, 2021

Description

Issues Resolved

Check List

odfe-release-bot commented Apr 12, 2021

odfe-release-bot commented Apr 12, 2021

odfe-release-bot commented Apr 12, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

odfe-release-bot commented Apr 12, 2021

odfe-release-bot commented Apr 12, 2021

odfe-release-bot commented Apr 12, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Bukhtawar left a comment

Choose a reason for hiding this comment

odfe-release-bot commented Apr 13, 2021

odfe-release-bot commented Apr 13, 2021

odfe-release-bot commented Apr 13, 2021

nknize commented Apr 14, 2021

odfe-release-bot commented Apr 14, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

psychbot commented Apr 15, 2021

nknize commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 15, 2021

odfe-release-bot commented Apr 19, 2021

peterzhuamazon commented Apr 19, 2021

odfe-release-bot commented Apr 19, 2021

odfe-release-bot commented Apr 19, 2021

odfe-release-bot commented Apr 19, 2021

odfe-release-bot commented Apr 19, 2021

peterzhuamazon commented Apr 19, 2021

odfe-release-bot commented Apr 19, 2021

odfe-release-bot commented Apr 20, 2021

odfe-release-bot commented Apr 20, 2021

odfe-release-bot commented Apr 20, 2021

peterzhuamazon commented Apr 20, 2021

odfe-release-bot commented Apr 20, 2021

nknize left a comment • edited Loading

Choose a reason for hiding this comment

getsaurabh02 commented Apr 22, 2021

nknize commented Apr 22, 2021

getsaurabh02 commented Apr 22, 2021 • edited Loading

getsaurabh02 commented Apr 27, 2021

opensearch-ci-bot commented Jul 2, 2021

opensearch-ci-bot commented Jul 2, 2021

opensearch-ci-bot commented Jul 2, 2021

opensearch-ci-bot commented Sep 8, 2021

opensearch-ci-bot commented Sep 29, 2021

nknize left a comment •

edited

Loading

getsaurabh02 commented Apr 22, 2021 •

edited

Loading