add document for Query Insights health_stats API (opensearch-project#…
…8627)

* add document for Query Insights health_stats API

Signed-off-by: Chenyang Ji <cyji@amazon.com>

* Doc review

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Update _observing-your-data/query-insights/api.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Move metrics counters section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Clarification

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change title of page

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

---------

Signed-off-by: Chenyang Ji <cyji@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
4 people authored Oct 31, 2024
1 parent ea3f786 commit 9386dbc
Showing 2 changed files with 123 additions and 0 deletions.
119 changes: 119 additions & 0 deletions _observing-your-data/query-insights/health.md
@@ -0,0 +1,119 @@
---
layout: default
title: Query Insights plugin health
parent: Query insights
nav_order: 50
---

# Query Insights plugin health

The Query Insights plugin provides an [API](#health-stats-api) and [metrics](#opentelemetry-error-metrics-counters) for monitoring its health and performance, enabling proactive identification of issues that may affect query processing or system resources.

## Health Stats API
**Introduced 2.18**
{: .label .label-purple }

The Health Stats API returns health metrics for each node running the Query Insights plugin. These metrics provide an in-depth view of resource usage and the health of the query processing components.

### Path and HTTP methods

```json
GET _insights/health_stats
```

### Example request

```json
GET _insights/health_stats
```
{% include copy-curl.html %}
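
The same request can be sent from a script. The following is a minimal Python sketch, not an official client: the function names `health_stats_url` and `fetch_health_stats` are hypothetical, and it assumes a cluster reachable over plain HTTP without authentication. Add TLS and credentials as your cluster requires.

```python
import json
import urllib.request


def health_stats_url(base_url: str) -> str:
    """Build the Health Stats API URL from a cluster base URL (hypothetical helper)."""
    return base_url.rstrip("/") + "/_insights/health_stats"


def fetch_health_stats(base_url: str, timeout: float = 10.0) -> dict:
    """Send GET _insights/health_stats and return the parsed JSON body.

    Assumption: the cluster accepts unauthenticated HTTP requests.
    """
    request = urllib.request.Request(health_stats_url(base_url), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_health_stats("http://localhost:9200")` would return a dictionary keyed by node ID, assuming a cluster is listening at that address.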

### Example response

The response includes a set of health-related fields for each node:

```json
{
  "AqegbPL0Tv2XWvZV4PTS8Q": {
    "ThreadPoolInfo": {
      "query_insights_executor": {
        "type": "scaling",
        "core": 1,
        "max": 5,
        "keep_alive": "5m",
        "queue_size": 2
      }
    },
    "QueryRecordsQueueSize": 2,
    "TopQueriesHealthStats": {
      "latency": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      },
      "memory": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      },
      "cpu": {
        "TopQueriesHeapSize": 5,
        "QueryGroupCount_Total": 0,
        "QueryGroupCount_MaxHeap": 0
      }
    }
  }
}
```

### Response fields

The following table lists all response body fields.

Field | Data type | Description
:--- |:---| :---
`ThreadPoolInfo` | Object | Information about the Query Insights thread pool, including type, core count, max threads, and queue size. See [The ThreadPoolInfo object](#the-threadpoolinfo-object).
`QueryRecordsQueueSize` | Integer | The size of the queue that buffers incoming search queries before processing. A high value may suggest increased load or slower processing.
`TopQueriesHealthStats` | Object | Performance metrics for each top query service that provide information about memory allocation (heap size) and query grouping. See [The TopQueriesHealthStats object](#the-topquerieshealthstats-object).
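
Because a high `QueryRecordsQueueSize` may suggest increased load or slower processing, one way to use the response is to flag nodes whose queue size exceeds a limit. The following Python sketch works on a trimmed copy of the example response. The function name `nodes_over_queue_threshold` and the threshold values are illustrative assumptions; the plugin does not define a recommended limit.

```python
import json

# Trimmed copy of the example response; only the field used here is kept.
SAMPLE_RESPONSE = """
{
  "AqegbPL0Tv2XWvZV4PTS8Q": {
    "QueryRecordsQueueSize": 2
  }
}
"""


def nodes_over_queue_threshold(stats: dict, threshold: int) -> list:
    """Return the IDs of nodes whose QueryRecordsQueueSize exceeds the threshold.

    The threshold is chosen by the caller (an assumption, not a plugin setting).
    """
    return [
        node_id
        for node_id, node in stats.items()
        if node.get("QueryRecordsQueueSize", 0) > threshold
    ]


stats = json.loads(SAMPLE_RESPONSE)
flagged = nodes_over_queue_threshold(stats, threshold=1)
```

A suitable threshold depends on your workload, so treat the value above as a placeholder.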

### The ThreadPoolInfo object

The `ThreadPoolInfo` object contains the following detailed configuration and performance data for the thread pool dedicated to the Query Insights plugin.

Field | Data type | Description
:--- |:---| :---
`type`| String | The thread pool type (for example, `scaling`).
`core`| Integer | The minimum number of threads in the thread pool.
`max`| Integer | The maximum number of threads in the thread pool.
`keep_alive`| Time unit | The amount of time that idle threads are retained.
`queue_size`| Integer | The maximum number of tasks in the queue.

### The TopQueriesHealthStats object

The `TopQueriesHealthStats` object contains one entry for each top query metric type (`latency`, `memory`, and `cpu`). Each entry contains the following fields.

Field | Data type | Description
:--- |:---| :---
`TopQueriesHeapSize`| Integer | The heap memory allocation for the query group.
`QueryGroupCount_Total`| Integer | The total number of processed query groups.
`QueryGroupCount_MaxHeap`| Integer | The size of the max heap that stores all query groups in memory.
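
To compare the three metric types side by side, the per-node object can be flattened into one row per metric type. The following Python sketch uses the values from the example response; the function name `summarize_top_queries` is hypothetical and the sketch assumes every entry contains all three fields.

```python
# One node's entry from the example response, reduced to TopQueriesHealthStats.
node_stats = {
    "TopQueriesHealthStats": {
        "latency": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
        "memory": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
        "cpu": {"TopQueriesHeapSize": 5, "QueryGroupCount_Total": 0, "QueryGroupCount_MaxHeap": 0},
    }
}


def summarize_top_queries(node: dict) -> dict:
    """Map each metric type to (heap size, total query groups, max-heap size)."""
    summary = {}
    for metric_type, fields in node.get("TopQueriesHealthStats", {}).items():
        summary[metric_type] = (
            fields["TopQueriesHeapSize"],
            fields["QueryGroupCount_Total"],
            fields["QueryGroupCount_MaxHeap"],
        )
    return summary


summary = summarize_top_queries(node_stats)
```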

## OpenTelemetry error metrics counters

The Query Insights plugin integrates with OpenTelemetry to provide real-time error metrics counters. These counters help identify specific operational failures in the plugin and improve reliability. Each counter tracks one potential error source in the plugin workflow, allowing for more focused debugging and maintenance.

To collect these metrics, you must configure and collect query metrics. For more information, see [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/).

The following table lists all available metrics.

Field | Description
:--- | :---
`LOCAL_INDEX_READER_PARSING_EXCEPTIONS` | The number of errors that occur when parsing data using the LocalIndexReader.
`LOCAL_INDEX_EXPORTER_BULK_FAILURES` | The number of failures that occur when ingesting Query Insights plugin data into local indexes.
`LOCAL_INDEX_EXPORTER_EXCEPTIONS` | The number of exceptions that occur in the Query Insights plugin LocalIndexExporter.
`INVALID_EXPORTER_TYPE_FAILURES` | The number of invalid exporter type failures.
`INVALID_INDEX_PATTERN_EXCEPTIONS` | The number of invalid index pattern exceptions.
`DATA_INGEST_EXCEPTIONS` | The number of exceptions that occur when ingesting data into the Query Insights plugin.
`QUERY_CATEGORIZE_EXCEPTIONS` | The number of exceptions that occur when categorizing the queries.
`EXPORTER_FAIL_TO_CLOSE_EXCEPTION` | The number of failures that occur when closing the exporter.
4 changes: 4 additions & 0 deletions _observing-your-data/query-insights/index.md
@@ -42,3 +42,7 @@ You can obtain the following information using Query Insights:
- [Top n queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/top-n-queries/)
- [Grouping top N queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/grouping-top-n-queries/)
- [Query metrics]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/query-metrics/)

## Query Insights plugin health

For information about monitoring the health of the Query Insights plugin, see [Query Insights plugin health]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/health/).
