From 9386dbc0cbb802719148272f3b9ba648d3d29150 Mon Sep 17 00:00:00 2001 From: Chenyang Ji Date: Thu, 31 Oct 2024 13:34:47 -0700 Subject: [PATCH] add document for Query Insights health_stats API (#8627) * add document for Query Insights health_stats API Signed-off-by: Chenyang Ji * Doc review Signed-off-by: Fanit Kolchina * Update _observing-your-data/query-insights/api.md Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> * Move metrics counters section Signed-off-by: Fanit Kolchina * Clarification Signed-off-by: Fanit Kolchina * Change title of page Signed-off-by: Fanit Kolchina * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> --------- Signed-off-by: Chenyang Ji Signed-off-by: Fanit Kolchina Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Fanit Kolchina Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com> Co-authored-by: Nathan Bower --- _observing-your-data/query-insights/health.md | 119 ++++++++++++++++++ _observing-your-data/query-insights/index.md | 4 + 2 files changed, 123 insertions(+) create mode 100644 _observing-your-data/query-insights/health.md diff --git a/_observing-your-data/query-insights/health.md b/_observing-your-data/query-insights/health.md new file mode 100644 index 0000000000..02324d8237 --- /dev/null +++ b/_observing-your-data/query-insights/health.md @@ -0,0 +1,119 @@ +--- +layout: default +title: Query Insights plugin health +parent: Query insights +nav_order: 50 +--- + +# Query Insights plugin health + +The Query Insights plugin provides an [API](#health-stats-api) and [metrics](#opentelemetry-error-metrics-counters) for monitoring its health and performance, enabling proactive identification of issues that may affect query processing or system resources. + +## Health Stats API +**Introduced 2.18** +{: .label .label-purple } + +The Health Stats API provides health metrics for each node running the Query Insights plugin. These metrics allow for an in-depth view of resource usage and the health of the query processing components. + +### Path and HTTP methods + +```json +GET _insights/health_stats +``` + +### Example request + +```json +GET _insights/health_stats +``` +{% include copy-curl.html %} + +### Example response + +The response includes a set of health-related fields for each node: + +```json +PUT _cluster/settings +{ + "AqegbPL0Tv2XWvZV4PTS8Q": { + "ThreadPoolInfo": { + "query_insights_executor": { + "type": "scaling", + "core": 1, + "max": 5, + "keep_alive": "5m", + "queue_size": 2 + } + }, + "QueryRecordsQueueSize": 2, + "TopQueriesHealthStats": { + "latency": { + "TopQueriesHeapSize": 5, + "QueryGroupCount_Total": 0, + "QueryGroupCount_MaxHeap": 0 + }, + "memory": { + "TopQueriesHeapSize": 5, + "QueryGroupCount_Total": 0, + "QueryGroupCount_MaxHeap": 0 + }, + "cpu": { + "TopQueriesHeapSize": 5, + "QueryGroupCount_Total": 0, + "QueryGroupCount_MaxHeap": 0 + } + } + } +} +``` + +### Response fields + +The following table lists all response body fields. + +Field | Data type | Description +:--- |:---| :--- +`ThreadPoolInfo` | Object | Information about the Query Insights thread pool, including type, core count, max threads, and queue size. See [The ThreadPoolInfo object](#the-threadpoolinfo-object). +`QueryRecordsQueueSize` | Integer | The size of the queue that buffers incoming search queries before processing. A high value may suggest increased load or slower processing. +`TopQueriesHealthStats` | Object | Performance metrics for each top query service that provide information about memory allocation (heap size) and query grouping. See [The TopQueriesHealthStats object](#the-topquerieshealthstats-object). + +### The ThreadPoolInfo object + +The `ThreadPoolInfo` object contains the following detailed configuration and performance data for the thread pool dedicated to the Query Insights plugin. + +Field | Data type | Description +:--- |:---| :--- +`type`| String | The thread pool type (for example, `scaling`). +`core`| Integer | The minimum number of threads in the thread pool. +`max`| Integer | The maximum number of threads in the thread pool. +`keep_alive`| Time unit | The amount of time that idle threads are retained. +`queue_size`| Integer | The maximum number of tasks in the queue. + +### The TopQueriesHealthStats object + +The `TopQueriesHealthStats` object provides breakdowns for latency, memory, and CPU usage and contains the following information. + +Field | Data type | Description +:--- |:---| :--- +`TopQueriesHeapSize`| Integer | The heap memory allocation for the query group. +`QueryGroupCount_Total`| Integer | The total number of processed query groups. +`QueryGroupCount_MaxHeap`| Integer | The size of the max heap that stores all query groups in memory. + +## OpenTelemetry error metrics counters + +The Query Insights plugin integrates with OpenTelemetry to provide real-time error metrics counters. These counters help to identify specific operational failures in the plugin and improve reliability. Each metric provides targeted insights into potential error sources in the plugin workflow, allowing for more focused debugging and maintenance. + +To collect these metrics, you must configure and collect query metrics. For more information, see [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/). + +The following table lists all available metrics. + +Field | Description +:--- | :--- +`LOCAL_INDEX_READER_PARSING_EXCEPTIONS` | The number of errors that occur when parsing data using the LocalIndexReader. +`LOCAL_INDEX_EXPORTER_BULK_FAILURES` | The number of failures that occur when ingesting Query Insights plugin data into local indexes. +`LOCAL_INDEX_EXPORTER_EXCEPTIONS` | The number of exceptions that occur in the Query Insights plugin LocalIndexExporter. +`INVALID_EXPORTER_TYPE_FAILURES` | The number of invalid exporter type failures. +`INVALID_INDEX_PATTERN_EXCEPTIONS` | The number of invalid index pattern exceptions. +`DATA_INGEST_EXCEPTIONS` | The number of exceptions that occur when ingesting data into the Query Insights plugin. +`QUERY_CATEGORIZE_EXCEPTIONS` | The number of exceptions that occur when categorizing the queries. +`EXPORTER_FAIL_TO_CLOSE_EXCEPTION` | The number of failures that occur when closing the exporter. \ No newline at end of file diff --git a/_observing-your-data/query-insights/index.md b/_observing-your-data/query-insights/index.md index 7b2341ba54..4bba866c05 100644 --- a/_observing-your-data/query-insights/index.md +++ b/_observing-your-data/query-insights/index.md @@ -42,3 +42,7 @@ You can obtain the following information using Query Insights: - [Top n queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/top-n-queries/) - [Grouping top N queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/grouping-top-n-queries/) - [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/) + +## Query Insights plugin health + +For information about monitoring the health of the Query Insights plugin, see [Query Insights plugin health]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/health/). \ No newline at end of file