Add search backpressure cancellation at the coordinator level #5173

Closed
PritLadani opened this issue Nov 9, 2022 · 5 comments
Labels
enhancement Enhancement or improvement to existing feature or request Indexing & Search

Comments

@PritLadani
Contributor

PritLadani commented Nov 9, 2022

Is your feature request related to a problem? Please describe.
#1042 aims to build back-pressure support for search requests. As part of #4575, we have already added cancellation of SearchShardTasks based on resource consumption. This feature aims to cancel the most resource-intensive queries at the coordinator level. As part of #3982, we already track the resource consumption of SearchTasks, which we will use to make the cancellation decision for a query.

Describe the solution you'd like
Cancel the most resource-intensive in-flight search requests on a coordinator node, based on the resource consumption of their SearchTasks, when the node's resource usage has breached its assigned limits and has not recovered within a certain time threshold. The back-pressure model should identify the most resource-intensive queries with minimal wasteful work. Moreover, if partial results are not enabled for a query, cancelling the parent task will also cancel all of its child tasks.
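
The selection step described above might be sketched as follows. This is only an illustration under stated assumptions: the names `SearchTaskUsage`, `DuressTracker`, and `pick_cancellation_candidates` are hypothetical, and ranking candidates by heap usage is an assumption, since the request does not specify how "most resource intensive" is ordered.

```python
from dataclasses import dataclass


@dataclass
class SearchTaskUsage:
    """Resource usage of one coordinator-level search task (hypothetical shape)."""
    task_id: int
    cpu_millis: float
    heap_bytes: int
    elapsed_millis: float


class DuressTracker:
    """Reports duress only after `required_breaches` consecutive breaching samples."""

    def __init__(self, required_breaches: int) -> None:
        self.required_breaches = required_breaches
        self.streak = 0

    def record(self, breached: bool) -> bool:
        # One healthy sample resets the streak, so short spikes are ignored.
        self.streak = self.streak + 1 if breached else 0
        return self.streak >= self.required_breaches


def pick_cancellation_candidates(tasks, in_duress, max_cancellations):
    """Return the heaviest tasks first, capped at `max_cancellations`."""
    if not in_duress:
        return []
    # Assumption: heap usage is the ranking key; the issue does not fix one.
    ranked = sorted(tasks, key=lambda t: t.heap_bytes, reverse=True)
    return ranked[:max_cancellations]
```

Keeping the duress check separate from the ranking means that no task is inspected, let alone cancelled, while the node is healthy.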

Describe alternatives you've considered
An alternative we considered is to take the resource consumption of the child tasks into account as well, rather than only the resource stats of the parent task. However, the resource consumption of the child tasks alone does not let us correctly estimate the resources required by the parent task, so we will consider only the parent task's own resource consumption.

Additional context
Looking at or aggregating the resource stats of the child tasks does not give an estimate of the coordinator task's resource consumption. Hence, from those stats alone, we cannot estimate whether a search task will cause the node to go into duress.

@PritLadani added the enhancement and untriaged labels on Nov 9, 2022
@PritLadani
Contributor Author

PritLadani commented Dec 13, 2022

With the existing trackers for SearchShardTask and minimum threshold values, the parent task (SearchTask) gets cancelled when the node is in duress. The observed logs are below:

[2022-12-13T20:22:24,018][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task with id = 12121 cancellation reason = cpu usage exceeded [4.3ms >= 0s]
[2022-12-13T20:22:24,018][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task with id = 12121 cancellation reason = heap usage exceeded [1.3mb >= 1.2mb]
[2022-12-13T20:22:24,018][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task with id = 12121 cancellation reason = elapsed time exceeded [143ms >= 0s]
[2022-12-13T20:22:24,019][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] getTaskCancellation task = org.opensearch.action.search.SearchShardTask@63f93972
[2022-12-13T20:22:24,019][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] getTaskCancellation task = org.opensearch.action.search.SearchShardTask@9386617
[2022-12-13T20:22:24,019][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12124] due to high resource consumption [cpu usage exceeded [83.4ms >= 0s], heap usage exceeded [29.8mb >= 1.2mb], elapsed time exceeded [88ms >= 0s]]
[2022-12-13T20:22:24,019][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12125] due to high resource consumption [cpu usage exceeded [57.9ms >= 0s], heap usage exceeded [16.1mb >= 1.2mb], elapsed time exceeded [86.9ms >= 0s]]
[2022-12-13T20:22:24,020][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12121] due to high resource consumption [cpu usage exceeded [4.3ms >= 0s], heap usage exceeded [1.3mb >= 1.2mb], elapsed time exceeded [143ms >= 0s]]
[2022-12-13T20:22:24,020][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task is eligible for cancellation

[2022-12-13T20:23:00,124][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task with id = 12138 cancellation reason = cpu usage exceeded [5.1ms >= 0s]
[2022-12-13T20:23:00,124][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task with id = 12138 cancellation reason = elapsed time exceeded [128.9ms >= 0s]
[2022-12-13T20:23:00,124][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] getTaskCancellation task = org.opensearch.action.search.SearchShardTask@7eb8078d
[2022-12-13T20:23:00,125][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] getTaskCancellation task = org.opensearch.action.search.SearchShardTask@c58674d
[2022-12-13T20:23:00,125][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12141] due to high resource consumption [cpu usage exceeded [69.1ms >= 0s], heap usage exceeded [22.2mb >= 1.7mb], elapsed time exceeded [70.7ms >= 0s]]
[2022-12-13T20:23:00,125][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12142] due to high resource consumption [cpu usage exceeded [61.5ms >= 0s], heap usage exceeded [19.1mb >= 1.7mb], elapsed time exceeded [69.6ms >= 0s]]
[2022-12-13T20:23:00,125][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] [enforced mode] cancelling task [12138] due to high resource consumption [cpu usage exceeded [5.1ms >= 0s], elapsed time exceeded [128.9ms >= 0s]]
[2022-12-13T20:23:00,125][TRACE][o.o.t.TaskManager        ] [runTask-0] unregister task for id: 12141
[2022-12-13T20:23:00,125][INFO ][o.o.s.b.SearchBackpressureService] [runTask-0] parent task is eligible for cancellation

We will introduce new trackers and define new thresholds for parent task cancellation.

@PritLadani
Contributor Author

PritLadani commented Dec 21, 2022

Introduced the following settings, which are dynamically configurable:

| Setting | Default | Description |
| --- | --- | --- |
| search_backpressure.search_task.total_heap_percent_threshold | 5% | The heap usage threshold (as a percentage) required for the sum of heap usages of all search tasks before cancellation is applied. |
| search_backpressure.search_task.cpu_time_millis_threshold_for_search_query | 60000 | Defines the CPU usage threshold (in millis) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.elapsed_time_millis_threshold_for_search_query | 120000 | Defines the elapsed time threshold (in millis) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.heap_percent_threshold_for_search_query | 2% | Defines the heap usage threshold (as a percentage) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.heap_variance_for_search_query | 2.0 | Defines the heap usage variance for an individual parent task before it is considered for cancellation. A task is considered for cancellation when taskHeapUsage is greater than or equal to heapUsageMovingAverage * variance. |
| search_backpressure.search_task.heap_moving_average_window_size_for_search_query | 10 | Defines the window size to calculate the moving average of heap usage of completed parent tasks. |
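
The heap variance rule (taskHeapUsage >= heapUsageMovingAverage * variance) can be illustrated with a small sketch. The names `MovingAverage` and `exceeds_heap_variance` are hypothetical, and treating the per-task heap threshold as a floor below which a task is never flagged is an assumption made here for illustration, not something the settings list states.

```python
from collections import deque


class MovingAverage:
    """Average of the last `window_size` observations."""

    def __init__(self, window_size: int) -> None:
        # deque(maxlen=...) silently drops the oldest value once full.
        self.window = deque(maxlen=window_size)

    def record(self, value: float) -> None:
        self.window.append(value)

    def average(self) -> float:
        return sum(self.window) / len(self.window) if self.window else 0.0


def exceeds_heap_variance(task_heap_bytes, moving_avg, variance, min_heap_bytes):
    """True when the task's heap usage is at least `variance` times the average."""
    # Assumption: the per-task heap threshold is a floor, so small tasks are
    # never flagged merely for being above a low average.
    if task_heap_bytes < min_heap_bytes:
        return False
    return task_heap_bytes >= moving_avg.average() * variance
```

With a window of 10 and a variance of 2.0, a running task would be flagged once its heap usage reaches twice the average of the last ten completed parent tasks.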

After adding the above settings, the /_cluster/settings response looks like this:

{
    "acknowledged": true,
    "persistent": {
        "search_backpressure": {
            "mode": "enforced",
            "cancellation_burst": "10",
            "cancellation_ratio": "1",
            "cancellation_rate": "10",
            "search_task": {
                "elapsed_time_millis_threshold_for_search_query": "0",
                "heap_variance_for_search_query": "1.0",
                "total_heap_percent_threshold": "0.0",
                "heap_moving_average_window_size_for_search_query": "1",
                "cpu_time_millis_threshold_for_search_query": "0",
                "heap_percent_threshold_for_search_query": "0.0"
            },
            "node_duress": {
                "cpu_threshold": "0.0",
                "heap_threshold": "0.0",
                "num_successive_breaches": "1"
            },
            "search_shard_task": {
                "elapsed_time_millis_threshold": "0",
                "heap_variance": "1.0",
                "heap_percent_threshold": "0.0",
                "total_heap_percent_threshold": "0.0",
                "heap_moving_average_window_size": "1",
                "cpu_time_millis_threshold": "0"
            }
        }
    },
    "transient": {}
}

@PritLadani
Contributor Author

PritLadani commented Dec 21, 2022

Added search_task stats to the response of _nodes/stats/search_backpressure:

{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "runTask",
    "nodes": {
        "6F72foSsSKOLwYVcaBDUnQ": {
            "timestamp": 1671601861265,
            "name": "runTask-0",
            "transport_address": "127.0.0.1:9300",
            "host": "127.0.0.1",
            "ip": "127.0.0.1:9300",
            "roles": [
                "cluster_manager",
                "data",
                "ingest",
                "remote_cluster_client"
            ],
            "attributes": {
                "testattr": "test",
                "shard_indexing_pressure_enabled": "true"
            },
            "search_backpressure": {
                "search_task": {
                    "resource_tracker_stats": {
                        "elapsed_time_tracker": {
                            "cancellation_count": 8,
                            "current_max_millis": 0,
                            "current_avg_millis": 0
                        },
                        "heap_usage_tracker": {
                            "cancellation_count": 4,
                            "current_max_bytes": 0,
                            "current_avg_bytes": 0,
                            "rolling_avg_bytes": 205568
                        },
                        "cpu_usage_tracker": {
                            "cancellation_count": 8,
                            "current_max_millis": 0,
                            "current_avg_millis": 0
                        }
                    },
                    "cancellation_stats": {
                        "cancellation_count": 8,
                        "cancellation_limit_reached_count": 0
                    }
                },
                "search_shard_task": {
                    "resource_tracker_stats": {
                        "elapsed_time_tracker": {
                            "cancellation_count": 12,
                            "current_max_millis": 0,
                            "current_avg_millis": 0
                        },
                        "heap_usage_tracker": {
                            "cancellation_count": 10,
                            "current_max_bytes": 0,
                            "current_avg_bytes": 0,
                            "rolling_avg_bytes": 86032480
                        },
                        "cpu_usage_tracker": {
                            "cancellation_count": 12,
                            "current_max_millis": 0,
                            "current_avg_millis": 0
                        }
                    },
                    "cancellation_stats": {
                        "cancellation_count": 12,
                        "cancellation_limit_reached_count": 0
                    }
                },
                "mode": "enforced"
            }
        }
    }
}

search_task - Section containing the stats related to SearchTasks
search_shard_task - Section containing the stats related to SearchShardTasks
resource_tracker_stats - Section containing the stats from the different resource usage trackers for each task type
elapsed_time_tracker - Section containing the stats related to elapsed time for each task type
heap_usage_tracker - Section containing the stats related to heap usage for each task type
cpu_usage_tracker - Section containing the stats related to CPU usage for each task type
cancellation_stats - Section containing the cancellation stats for each task type
cancellation_count - Number of task cancellations performed on that node since the process started
cancellation_limit_reached_count - Number of iterations in which more tasks were eligible for cancellation than the permitted cancellation threshold allowed
mode - Search backpressure mode
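
As a rough illustration of how these fields could be consumed, the sketch below pulls the per-task-type cancellation counts out of an already-parsed stats response. The helper name `cancellation_summary` is hypothetical, and the input is a trimmed copy of the response shown above; fetching the response from a cluster is left out.

```python
def cancellation_summary(stats_response):
    """Map node name -> search backpressure mode and per-task-type cancellation counts."""
    summary = {}
    for node in stats_response.get("nodes", {}).values():
        backpressure = node.get("search_backpressure", {})
        row = {"mode": backpressure.get("mode")}
        for task_type in ("search_task", "search_shard_task"):
            stats = backpressure.get(task_type, {}).get("cancellation_stats", {})
            # Missing sections count as zero cancellations.
            row[task_type] = stats.get("cancellation_count", 0)
        summary[node.get("name")] = row
    return summary


# Trimmed copy of the _nodes/stats/search_backpressure response above.
response = {
    "nodes": {
        "6F72foSsSKOLwYVcaBDUnQ": {
            "name": "runTask-0",
            "search_backpressure": {
                "search_task": {
                    "cancellation_stats": {
                        "cancellation_count": 8,
                        "cancellation_limit_reached_count": 0,
                    }
                },
                "search_shard_task": {
                    "cancellation_stats": {
                        "cancellation_count": 12,
                        "cancellation_limit_reached_count": 0,
                    }
                },
                "mode": "enforced",
            },
        }
    }
}

print(cancellation_summary(response))
```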

@PritLadani changed the title from "Cancellation of In-flight Search Requests at Coordinator Level" to "Add search backpressure cancellation at the coordinator level" on Feb 3, 2023
@PritLadani
Contributor Author

After multiple iterations, we have introduced the following settings, which are dynamically configurable:

| Setting | Default | Description |
| --- | --- | --- |
| search_backpressure.search_task.total_heap_percent_threshold | 5% | The heap usage threshold (as a percentage) required for the sum of heap usages of all search tasks before cancellation is applied. |
| search_backpressure.search_task.cpu_time_millis_threshold | 30000 | Defines the CPU usage threshold (in millis) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.elapsed_time_millis_threshold | 45000 | Defines the elapsed time threshold (in millis) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.heap_percent_threshold | 2% | Defines the heap usage threshold (as a percentage) for an individual parent task before it is considered for cancellation. |
| search_backpressure.search_task.heap_variance | 2.0 | Defines the heap usage variance for an individual parent task before it is considered for cancellation. A task is considered for cancellation when taskHeapUsage is greater than or equal to heapUsageMovingAverage * variance. |
| search_backpressure.search_task.heap_moving_average_window_size | 100 | Defines the window size to calculate the moving average of heap usage of completed parent tasks. |
| search_backpressure.search_task.cancellation_ratio | 0.1 | The maximum number of SearchTasks to cancel, as a percentage of successful SearchTask completions. |
| search_backpressure.search_task.cancellation_rate | 0.003 | The maximum number of SearchTasks to cancel per millisecond of elapsed time. |
| search_backpressure.search_task.cancellation_burst | 5 | The maximum number of SearchTasks to cancel in a single iteration of the observer thread. |
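
One common way to enforce a per-millisecond rate together with a burst cap is a token bucket. The sketch below is a generic token bucket, offered only as an illustration of how cancellation_rate and cancellation_burst could interact; the class name `TokenBucket` and its methods are hypothetical, and the cancellation_ratio limit is not modelled here.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate_per_milli`, holds at most `burst` tokens."""

    def __init__(self, rate_per_milli: float, burst: float, now_millis: float = 0.0) -> None:
        self.rate = rate_per_milli
        self.burst = burst
        self.tokens = burst  # start full, so an initial burst is allowed
        self.last_millis = now_millis

    def request(self, now_millis: float) -> bool:
        """Consume one token if available; each token permits one cancellation."""
        elapsed = max(0.0, now_millis - self.last_millis)
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_millis = now_millis
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Defaults from the table above: 0.003 cancellations per millisecond, burst of 5.
limiter = TokenBucket(rate_per_milli=0.003, burst=5)
allowed = [limiter.request(now_millis=0) for _ in range(6)]
```

Under these defaults, at most five cancellations can happen back to back, after which new permits accrue at roughly three per second.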

In addition to the above settings, we have deprecated the following settings:

| Setting |
| --- |
| search_backpressure.cancellation_ratio |
| search_backpressure.cancellation_rate |
| search_backpressure.cancellation_burst |

We have also introduced replacements for the deprecated settings:

| Setting | Default | Description |
| --- | --- | --- |
| search_backpressure.search_shard_task.cancellation_ratio | 0.1 | The maximum number of SearchShardTasks to cancel, as a percentage of successful SearchShardTask completions. |
| search_backpressure.search_shard_task.cancellation_rate | 0.003 | The maximum number of SearchShardTasks to cancel per millisecond of elapsed time. |
| search_backpressure.search_shard_task.cancellation_burst | 10 | The maximum number of SearchShardTasks to cancel in a single iteration of the observer thread. |

After adding the above settings, the /_cluster/settings response looks like this:

{
    "acknowledged": true,
    "persistent": {
        "search_backpressure": {
            "mode": "monitor_only",
            "cancellation_burst": "10.0",
            "cancellation_ratio": "0.1",
            "cancellation_rate": "0.003",
            "search_task": {
                "elapsed_time_millis_threshold": "45000",
                "heap_variance": "2.0",
                "heap_percent_threshold": "0.02",
                "cancellation_burst": "5.0",
                "cpu_time_millis_threshold": "30000",
                "cancellation_ratio": "0.1",
                "cancellation_rate": "0.003",
                "total_heap_percent_threshold": "0.05",
                "heap_moving_average_window_size": "100"
            },
            "node_duress": {
                "cpu_threshold": "0.9",
                "heap_threshold": "0.7",
                "num_successive_breaches": "3"
            },
            "search_shard_task": {
                "elapsed_time_millis_threshold": "30000",
                "heap_variance": "2.0",
                "heap_percent_threshold": "0.005",
                "cancellation_burst": "10.0",
                "cpu_time_millis_threshold": "15000",
                "cancellation_ratio": "0.1",
                "cancellation_rate": "0.003",
                "total_heap_percent_threshold": "0.05",
                "heap_moving_average_window_size": "100"
            }
        }
    },
    "transient": {}
}
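
To compare a nested response like the one above against the dotted setting names used in this thread, the nested keys can be flattened. The helper `flatten_settings` below is a hypothetical convenience for that comparison and works on an excerpt of the response; it is not part of any OpenSearch client.

```python
def flatten_settings(nested, prefix=""):
    """Turn nested settings into a flat dict keyed by dotted setting names."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-sections such as "search_task" or "node_duress".
            flat.update(flatten_settings(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Excerpt of the "persistent" section of the response above.
persistent = {
    "search_backpressure": {
        "mode": "monitor_only",
        "search_task": {
            "cancellation_burst": "5.0",
            "cpu_time_millis_threshold": "30000",
        },
        "node_duress": {"num_successive_breaches": "3"},
    }
}

for name, value in sorted(flatten_settings(persistent).items()):
    print(f"{name} = {value}")
```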

@PritLadani
Contributor Author

This is fixed as a part of #5605.
