Search Task Resource Tracking PoC #1643

tushar-kharbanda72 · 2021-12-02T08:54:37Z

Description

This PoC captures the system resource overhead of search requests on either a data node or a coordinator node. It captures all heap allocations made by threads running on the search threadpool. It also captures the heap overhead of the responses received on the coordinator while it waits for the other data nodes to respond (this serialization of the response is generally done on transport_worker and is not accounted for by search threadpool tracking).

This PoC only aims to validate correctness and performance impact. After these validations, I'll focus on the concrete design aspects of the changes.

Issues Resolved

#1179

Check List

- New functionality includes testing.
  - All tests pass
- New functionality has been documented.
  - New functionality has javadoc added
- Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

opensearch-ci-bot · 2021-12-02T08:56:02Z

Can one of the admins verify this patch?

opensearch-ci-bot · 2021-12-02T08:59:38Z

✅ Gradle Wrapper Validation success a439a4d

opensearch-ci-bot · 2021-12-02T09:01:06Z

❌ Gradle Precommit failure a439a4d
Log 1683

opensearch-ci-bot · 2021-12-02T09:04:39Z

❌ Gradle Check failure a439a4d
Log 1298

Reports 1298

Bukhtawar · 2021-12-02T12:24:31Z

server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java

@@ -283,6 +287,10 @@ public void innerOnResponse(Result result) {
                            } finally {
                                executeNext(pendingExecutions, thread);
                            }
+                            ThreadMXBean threadMXBean = (ThreadMXBean) ManagementFactory.getThreadMXBean();


This lookup is expensive. Move it to a static final variable.

reta · 2021-12-02T13:54:37Z

server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java

@@ -283,6 +287,10 @@ public void innerOnResponse(Result result) {
                            } finally {
                                executeNext(pendingExecutions, thread);
                            }
+                            ThreadMXBean threadMXBean = (ThreadMXBean) ManagementFactory.getThreadMXBean();
+                            long bytes = threadMXBean.getThreadAllocatedBytes(Thread.currentThread().getId());


Also, it is worth checking whether thread allocation tracking is supported and enabled, to avoid doing unnecessary work: ThreadMXBean::isThreadAllocatedMemorySupported() and ThreadMXBean::isThreadAllocatedMemoryEnabled()
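The two suggestions on this hunk (a single shared bean, and a capability check before reading counters) could be combined roughly as below. This is only a sketch: `AllocationProbe` and its method names are hypothetical and not part of this PR, and it assumes a JVM that may or may not expose the `com.sun.management.ThreadMXBean` extension.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper (not from this PR): one shared bean plus a capability check. */
final class AllocationProbe {
    // Looked up once, so getThreadMXBean() is not repeated on the hot path.
    private static final com.sun.management.ThreadMXBean BEAN = lookup();

    private static com.sun.management.ThreadMXBean lookup() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // The allocation counters live on the com.sun.management extension,
        // which is not guaranteed to be present on every JVM.
        return bean instanceof com.sun.management.ThreadMXBean
            ? (com.sun.management.ThreadMXBean) bean
            : null;
    }

    /** True only if per-thread allocation tracking is both supported and enabled. */
    static boolean isTrackingAvailable() {
        return BEAN != null
            && BEAN.isThreadAllocatedMemorySupported()
            && BEAN.isThreadAllocatedMemoryEnabled();
    }

    /** Bytes allocated so far by the calling thread, or -1 if tracking is unavailable. */
    static long currentThreadAllocatedBytes() {
        if (!isTrackingAvailable()) {
            return -1L;
        }
        return BEAN.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        System.out.println("tracking available: " + isTrackingAvailable());
        System.out.println("allocated bytes: " + currentThreadAllocatedBytes());
    }
}
```

Callers can then skip the bookkeeping entirely when `isTrackingAvailable()` returns false.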

Bukhtawar · 2021-12-02T12:25:45Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+     */
+    private TaskResourceTracker() {
+        taskMap = new ConcurrentHashMap<>();
+        threadMXBean = (ThreadMXBean) ManagementFactory.getThreadMXBean();


static final

... and we could remove direct usage of the ThreadMXBean from all other places and just fence it behind TaskResourceTracker

Bukhtawar · 2021-12-02T14:11:30Z

server/src/main/java/org/opensearch/tasks/TaskManager.java

+        // just register read operations
+        if (action.startsWith("indices:data/read")) {
+            if (threadContext.getTransient("TASK_ID") == null) {
+                threadContext.putTransient("TASK_ID", String.valueOf(task.getId()));
+
+                List<String> indices = new ArrayList<>();
+                if (request instanceof IndicesRequest) {
+                    indices = Arrays.asList(((IndicesRequest) request).indices());
+                }
+                // TODO Add shard id handling
+                TaskResourceTracker.getInstance().registerTaskForTracking(task.getId(), indices, null, action);
+            }
+        }
+


Why can't this be modelled as a TaskListener?

Bukhtawar · 2021-12-02T14:12:20Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+import java.util.List;
+import java.util.concurrent.ConcurrentHashMap;
+
+public class TaskResourceTracker implements ResourceWatcher {


Please add Javadocs for all new classes.

Bukhtawar · 2021-12-02T14:13:07Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+    public void registerTaskForTracking(long taskId, List<String> indices, ShardId shardId, String actionName) {
+        taskMap.put(new TaskInfoKey(taskId, indices, shardId, actionName), new ArrayList<>());
+    }
+
+    public void registerWorkerForTask(long taskId, long workerId, long cpuCurrent, long bytesCurrent, String threadpoolName) {
+        // TODO remove this after identifying cases where it can be true
+        if (taskMap.get(new TaskInfoKey(taskId)) == null) {
+            return;
+        }
+
+        TaskWorkerResourceUtilInfo taskWorkerResourceUtilInfo =
+            new TaskWorkerResourceUtilInfo(workerId, cpuCurrent, cpuCurrent, bytesCurrent, bytesCurrent,
+                true, threadpoolName);
+
+        taskMap.get(new TaskInfoKey(taskId)).add(taskWorkerResourceUtilInfo);
+    }


Same as above; maybe model this as a TaskListener?

Bukhtawar · 2021-12-02T14:20:21Z

server/src/main/java/org/opensearch/tasks/TaskInfoKey.java

+    public TaskInfoKey(long taskId, List<String> indices, ShardId shardId, String action) {
+        this.taskId = taskId;
+        this.indices = indices;
+        this.shardId = shardId;
+        this.action = action;
+    }


A Task is so far not bound to a ShardId; this should be more generic.

Bukhtawar · 2021-12-02T14:24:28Z

server/src/main/java/org/opensearch/transport/InboundHandler.java

+            ThreadMXBean threadMXBean = (ThreadMXBean) ManagementFactory.getThreadMXBean();
+            long bytesStart = threadMXBean.getThreadAllocatedBytes(Thread.currentThread().getId());
+


ThreadMXBean is spread all over the place. Let's simplify.

Bukhtawar · 2021-12-02T14:25:11Z

server/src/main/java/org/opensearch/transport/InboundHandler.java

+
+//            long bytesEnd = threadMXBean.getThreadAllocatedBytes(Thread.currentThread().getId());
+            if (response instanceof SearchPhaseResult) {
+//                TaskResourceTracker.getInstance().registerResponseOverhead(((SearchPhaseResult) response).getShardSearchRequest().getParentTask().getId(), bytesEnd - bytesStart);
+                TaskResourceTracker.getInstance().registerResponseOverhead1(response, bytesStart);
+            }
+


Why place it in InboundHandler?

Bukhtawar · 2021-12-02T14:27:06Z

server/src/main/java/org/opensearch/common/util/concurrent/ResourceRunnable.java

+import org.opensearch.ExceptionsHelper;
+import org.opensearch.tasks.TaskResourceTracker;
+
+public class ResourceRunnable extends AbstractRunnable implements WrappedRunnable {


Let's add Javadocs to explain how the cost of running threads is computed and associated with the corresponding task.
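For readers following along, the general idea of charging a thread's allocations to the work it runs can be sketched as a wrapper that reads the thread's allocation counter before and after the delegate runs. This is an illustration only, not the ResourceRunnable from this PR: `AllocationTrackingRunnable` and the `onComplete` callback are hypothetical names, and the cast assumes a HotSpot-style JVM that provides `com.sun.management.ThreadMXBean`.

```java
import java.lang.management.ManagementFactory;
import java.util.function.LongConsumer;

/** Hypothetical wrapper (not the PR's ResourceRunnable): measures bytes allocated by run(). */
final class AllocationTrackingRunnable implements Runnable {
    // Assumption: the platform bean implements the com.sun.management extension.
    private static final com.sun.management.ThreadMXBean BEAN =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    private final Runnable delegate;
    private final LongConsumer onComplete; // receives the bytes allocated during run()

    AllocationTrackingRunnable(Runnable delegate, LongConsumer onComplete) {
        this.delegate = delegate;
        this.onComplete = onComplete;
    }

    @Override
    public void run() {
        long threadId = Thread.currentThread().getId();
        // The counter is cumulative per thread, so the task's cost is the delta.
        long start = BEAN.getThreadAllocatedBytes(threadId);
        try {
            delegate.run();
        } finally {
            long end = BEAN.getThreadAllocatedBytes(threadId);
            onComplete.accept(end - start);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(new AllocationTrackingRunnable(
            () -> { byte[] buffer = new byte[1 << 20]; buffer[0] = 1; },
            bytes -> System.out.println("allocated during task: " + bytes + " bytes")));
        worker.start();
        worker.join();
    }
}
```

In a real tracker the callback would attribute the delta to the task id carried in the thread context, which is the association the Javadoc should spell out.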

Bukhtawar · 2021-12-02T15:22:25Z

server/src/main/java/org/opensearch/node/Node.java

@@ -431,6 +432,7 @@ protected Node(
            resourcesToClose.add(() -> ThreadPool.terminate(threadPool, 10, TimeUnit.SECONDS));
            final ResourceWatcherService resourceWatcherService = new ResourceWatcherService(settings, threadPool);
            resourcesToClose.add(resourceWatcherService);
+            resourceWatcherService.add(TaskResourceTracker.getInstance(), ResourceWatcherService.Frequency.HIGH);


We don't need to watch every 5s; that seems wasteful. Instead, it should be tied to overall usage (triggered when utilization goes beyond a threshold), to individual task-level resource utilization, or be on-demand.

Bukhtawar · 2021-12-02T15:33:37Z

server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java

+
+                            TaskResourceTracker.getInstance().transfer(task.getId(), result, bytes);


It would be too hard to maintain the code base with this construct. Let's simplify.

Bukhtawar · 2021-12-02T15:42:59Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+            if ("tw".equals(taskWorkerResourceUtilInfo.getThreadPoolName())) {
+                response[0] += taskWorkerResourceUtilInfo.getOverheardBytes();
+            } else {
+                search[0] += taskWorkerResourceUtilInfo.getHeapNow() - taskWorkerResourceUtilInfo.getHeapStart();
+            }


The transport worker is a very critical thread. I'd advise against any custom logic for it.

Bukhtawar · 2021-12-02T15:44:46Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+    public void transfer(long taskId, Object ob, long bytes) {
+        TaskInfoKey key = new TaskInfoKey(taskId);
+        if (!overhead.containsKey(ob) || taskMap.get(key) == null) return;
+
+        long bytesStart = overhead.get(ob);
+
+        TaskWorkerResourceUtilInfo t = new TaskWorkerResourceUtilInfo(1L, 0L, 0L, bytesStart, bytes, false, "tw");
+        taskMap.get(key).add(t);
+        overhead.remove(ob);
+    }


Please elaborate on this logic.

Bukhtawar · 2021-12-02T15:48:36Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+        taskMap.forEach((taskInfoKey, taskWorkerResourceUtilInfos) -> taskWorkerResourceUtilInfos.forEach(taskWorkerResourceUtilInfo -> {
+            if (taskWorkerResourceUtilInfo.isActive()) {
+                taskWorkerResourceUtilInfo.setHeapNow(threadMXBean.getThreadAllocatedBytes(taskWorkerResourceUtilInfo.getWorkerId()));
+            }
+            if ("tw".equals(taskWorkerResourceUtilInfo.getThreadPoolName())) {
+                response[0] += taskWorkerResourceUtilInfo.getOverheardBytes();
+            } else {
+                search[0] += taskWorkerResourceUtilInfo.getHeapNow() - taskWorkerResourceUtilInfo.getHeapStart();


Use the optimized batch API instead
https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/ThreadMXBean.html#getThreadAllocatedBytes(long[])
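A minimal sketch of the batch call, assuming a JVM that provides `com.sun.management.ThreadMXBean`; the class name `BatchAllocationSample` is made up for illustration. One call returns the counters for all requested thread ids, instead of one call per worker inside the loop.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical example (not from this PR) of the array-based allocation query. */
final class BatchAllocationSample {
    // Assumption: the platform bean implements the com.sun.management extension.
    private static final com.sun.management.ThreadMXBean BEAN =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    /** Returns one counter per requested thread id, in the same order as the input. */
    static long[] allocatedBytes(long[] threadIds) {
        return BEAN.getThreadAllocatedBytes(threadIds);
    }

    public static void main(String[] args) {
        long[] ids = ManagementFactory.getThreadMXBean().getAllThreadIds();
        long[] bytes = allocatedBytes(ids);
        for (int i = 0; i < ids.length; i++) {
            // A value of -1 means the thread is no longer alive or tracking is disabled.
            System.out.println("thread " + ids[i] + ": " + bytes[i] + " bytes");
        }
    }
}
```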

Bukhtawar

I suggest using the static assignment below to reduce the overhead of ThreadMXBean:

+    static {
+        threadMXBean = ManagementFactory.getThreadMXBean();
+        Method getBytes;
+        try {
+            getBytes = threadMXBean.getClass()
+                    .getMethod("getThreadAllocatedBytes", long[].class);
+            getBytes.setAccessible(true);
+        } catch (NoSuchMethodException e) {
+            getBytes = null;
+        }
+        getThreadAllocatedBytes = getBytes;
+    }

reta · 2021-12-02T16:11:15Z

server/src/main/java/org/opensearch/tasks/TaskManager.java

@@ -150,6 +153,20 @@ public Task register(String type, String action, TaskAwareRequest request) {
            logger.trace("register {} [{}] [{}] [{}]", task.getId(), type, action, task.getDescription());
        }

+        // just register read operations
+        if (action.startsWith("indices:data/read")) {


I would suggest enriching the action with something like isResourceTrackingEnabled() and using it as an indicator of the need to track resources. Also, the tracking key (in this case indices, but it is action specific) has to be provided by the action as well, e.g. through a getResourceTrackingKey method.
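One possible shape for that suggestion, sketched with hypothetical types: `ResourceTrackable`, `DemoSearchRequest`, and `TrackingDecision` do not exist in OpenSearch, and the sketch places the two methods on the request object rather than on the action, purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical registration-side check: no string matching on the action name. */
final class TrackingDecision {
    /** Returns the tracking key, or null when the request did not opt in. */
    static List<String> trackingKeyFor(Object request) {
        if (request instanceof ResourceTrackable) {
            ResourceTrackable trackable = (ResourceTrackable) request;
            if (trackable.isResourceTrackingEnabled()) {
                return trackable.getResourceTrackingKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(trackingKeyFor(new DemoSearchRequest("logs", "metrics")));
        System.out.println(trackingKeyFor("an untracked request"));
    }
}

/** Hypothetical opt-in contract; tracking is off unless an implementation says otherwise. */
interface ResourceTrackable {
    default boolean isResourceTrackingEnabled() {
        return false;
    }

    default List<String> getResourceTrackingKey() {
        return List.of();
    }
}

/** Hypothetical request that opts in and uses its indices as the tracking key. */
final class DemoSearchRequest implements ResourceTrackable {
    private final String[] indices;

    DemoSearchRequest(String... indices) {
        this.indices = indices;
    }

    @Override
    public boolean isResourceTrackingEnabled() {
        return true;
    }

    @Override
    public List<String> getResourceTrackingKey() {
        return Arrays.asList(indices);
    }
}
```

With a contract like this, the `action.startsWith("indices:data/read")` test and the `instanceof IndicesRequest` branch in the hunk above would collapse into a single opt-in check.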

Bukhtawar · 2021-12-02T16:15:07Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+    public void registerWorkerForTask(long taskId, long workerId, long cpuCurrent, long bytesCurrent, String threadpoolName) {
+        // TODO remove this after identifying cases where it can be true
+        if (taskMap.get(new TaskInfoKey(taskId)) == null) {
+            return;
+        }
+
+        TaskWorkerResourceUtilInfo taskWorkerResourceUtilInfo =
+            new TaskWorkerResourceUtilInfo(workerId, cpuCurrent, cpuCurrent, bytesCurrent, bytesCurrent,
+                true, threadpoolName);
+
+        taskMap.get(new TaskInfoKey(taskId)).add(taskWorkerResourceUtilInfo);
+    }


There could be concurrency bugs here: the entry can be removed between the two gets (the null check and the later add).
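A sketch of one way to close that window, using a single atomic `computeIfPresent` instead of a null check followed by a second lookup. `WorkerRegistry` and its methods are hypothetical stand-ins for the tracker's map handling, and a thread-safe list is assumed as the map value because several workers may register concurrently.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for the tracker's task map (not the PR's TaskResourceTracker). */
final class WorkerRegistry {
    private final ConcurrentHashMap<Long, List<String>> taskMap = new ConcurrentHashMap<>();

    void registerTask(long taskId) {
        taskMap.putIfAbsent(taskId, new CopyOnWriteArrayList<>());
    }

    void unregisterTask(long taskId) {
        taskMap.remove(taskId);
    }

    /** Returns true if the worker was attached to a currently registered task. */
    boolean registerWorker(long taskId, String worker) {
        // Lookup and update happen in one atomic step, so the entry cannot
        // disappear between the "check" and the "use".
        return taskMap.computeIfPresent(taskId, (id, workers) -> {
            workers.add(worker);
            return workers;
        }) != null;
    }

    int workerCount(long taskId) {
        List<String> workers = taskMap.get(taskId);
        return workers == null ? 0 : workers.size();
    }

    public static void main(String[] args) {
        WorkerRegistry registry = new WorkerRegistry();
        registry.registerTask(1L);
        System.out.println(registry.registerWorker(1L, "search-thread-1"));
        System.out.println(registry.registerWorker(2L, "search-thread-2"));
    }
}
```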

reta · 2021-12-02T16:15:53Z

server/src/main/java/org/opensearch/tasks/TaskResourceTracker.java

+        TaskInfoKey key = new TaskInfoKey(taskId);
+        if (!overhead.containsKey(ob) || taskMap.get(key) == null) return;
+
+        long bytesStart = overhead.get(ob);


NPE here if the key gets removed between the check and the usage (a null value is then unboxed into a long). Use Long instead.
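A sketch of the fix, with hypothetical names (`OverheadTable`, `record`, `consume`): reading into a boxed `Long` via a single `remove()` avoids both the check-then-get race and the unboxing of a missing entry.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the tracker's overhead map (not the PR's code). */
final class OverheadTable {
    private final ConcurrentHashMap<Object, Long> overhead = new ConcurrentHashMap<>();

    void record(Object response, long bytesStart) {
        overhead.put(response, bytesStart);
    }

    /** Returns bytesNow minus the recorded start value, or -1 if nothing was recorded. */
    long consume(Object response, long bytesNow) {
        // remove() reads and deletes in one atomic step; keeping the result boxed
        // lets a missing entry show up as null instead of failing on unboxing.
        Long bytesStart = overhead.remove(response);
        if (bytesStart == null) {
            return -1L;
        }
        return bytesNow - bytesStart;
    }

    public static void main(String[] args) {
        OverheadTable table = new OverheadTable();
        Object response = new Object();
        table.record(response, 100L);
        System.out.println(table.consume(response, 150L));
        System.out.println(table.consume(response, 200L));
    }
}
```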

reta · 2021-12-02T16:18:14Z

server/src/main/java/org/opensearch/threadpool/ThreadPoolStats.java

@@ -74,6 +88,8 @@ public Stats(StreamInput in) throws IOException {
            rejected = in.readLong();
            largest = in.readInt();
            completed = in.readLong();
+            bytes = in.readLong();


Please wrap in version checks:

if (in.getVersion().onOrAfter(Version.V_2_0_0)) { bytes = in.readLong(); ro = in.readLong(); }

reta · 2021-12-02T16:18:32Z

server/src/main/java/org/opensearch/threadpool/ThreadPoolStats.java

@@ -85,6 +101,8 @@ public void writeTo(StreamOutput out) throws IOException {
            out.writeLong(rejected);
            out.writeInt(largest);
            out.writeLong(completed);
+            out.writeLong(bytes);


if (out.getVersion().onOrAfter(Version.V_2_0_0)) { out.writeLong(bytes); out.writeLong(ro); }
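The reason for gating both sides can be illustrated with plain `java.io` streams: the writer must emit the new field only when the peer understands it, and the reader must consume it only when the peer sent it. This is a simplified sketch, not OpenSearch code; `VersionedStats`, the integer `peerVersion`, and `V_NEW` are hypothetical stand-ins for the stream version checks quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stats record with one field that only newer peers understand. */
final class VersionedStats {
    static final int V_NEW = 2; // hypothetical wire version that introduced "bytes"

    final long completed;
    final long bytes;

    VersionedStats(long completed, long bytes) {
        this.completed = completed;
        this.bytes = bytes;
    }

    /** Serializes for a peer on the given version; older peers never see the new field. */
    byte[] toBytes(int peerVersion) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(completed);
            if (peerVersion >= V_NEW) {
                out.writeLong(bytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Deserializes data written by a peer on the given version. */
    static VersionedStats fromBytes(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long completed = in.readLong();
            // Older peers did not write the field, so fall back to a default.
            long bytes = peerVersion >= V_NEW ? in.readLong() : 0L;
            return new VersionedStats(completed, bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VersionedStats stats = new VersionedStats(7L, 4096L);
        System.out.println(fromBytes(stats.toBytes(1), 1).bytes);
        System.out.println(fromBytes(stats.toBytes(V_NEW), V_NEW).bytes);
    }
}
```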

Bukhtawar

Thanks Tushar, this is good stuff. I like the thought you have put into this, especially with this being your first contribution.

tushar-kharbanda72 · 2021-12-05T08:42:14Z

Thanks @reta and @Bukhtawar for taking the time to review this PoC code. I have started thinking more about the final design and have divided the work into 4 to 5 meaningful chunks. I will start raising PRs for production-ready code from next week (making sure these comments are addressed as well).

dblock · 2022-03-21T18:58:52Z

@tushar-kharbanda72 Want to finish this?

tushar-kharbanda72 · 2022-03-29T14:29:37Z

> @tushar-kharbanda72 Want to finish this?

@dblock This feature has been implemented. Closing this PoC PR.

Feature implementation PRs:

PR 1: #2089
PR 2: #2639

Search Task Resource Tracking PoC

a439a4d

tushar-kharbanda72 requested a review from a team as a code owner December 2, 2021 08:54

Bukhtawar and reta reviewed Dec 2, 2021

Bukhtawar mentioned this pull request Dec 2, 2021: Track resource consumption for query and fetch phases #1575 (Closed)

tushar-kharbanda72 closed this Mar 29, 2022
