Add concurrent segment search documentation (#4990)

* Add concurrent segment search documentation

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add API information

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Small rewording

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add tech review feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Fix link

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add API section

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Small rewriting

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Rewriting

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Tech review feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update profile.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update _api-reference/profile.md

Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Update concurrent-segment-search.md

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Implemented editorial comments

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Melissa Vagi <vagimeli@amazon.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
3 people committed Dec 20, 2023
1 parent 4d355c5 commit beb2182
Showing 4 changed files with 396 additions and 6 deletions.
17 changes: 15 additions & 2 deletions _api-reference/index-apis/stats.md
@@ -5,7 +5,7 @@ parent: Index APIs
nav_order: 72
---

# Index stats
# Index Stats

The Index Stats API provides index statistics. For data streams, the API provides statistics for the stream's backing indexes. By default, the returned statistics are index level. To receive shard-level statistics, set the `level` parameter to `shards`.

@@ -792,4 +792,17 @@ GET /testindex*/_stats?expand_wildcards=open,hidden
```json
GET /testindex/_stats?level=shards
```
{% include copy-curl.html %}
{% include copy-curl.html %}

## Concurrent segment search

Starting in OpenSearch 2.10, [concurrent segment search]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search/) allows each shard-level request to search segments in parallel during the query phase. If you [enable the experimental concurrent segment search feature flag]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search#enabling-the-feature-flag), the Index Stats API response will contain several additional fields with statistics about slices (units of work executed by a thread). These fields will be provided whether or not the cluster and index settings for concurrent segment search are enabled. For more information about slices, see [Concurrent segment search]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search#searching-segments-concurrently).
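
For reference, after the feature flag is set, concurrent segment search is toggled dynamically at the cluster level. The following is a minimal sketch that assumes the `search.concurrent_segment_search.enabled` setting name described in the concurrent segment search documentation linked above; verify the setting name against your OpenSearch version.

```json
PUT /_cluster/settings
{
  "persistent": {
    "search.concurrent_segment_search.enabled": true
  }
}
```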

The following table provides information about the added response fields.

|Response field | Description |
|:--- |:--- |
|`search.concurrent_avg_slice_count` |The average slice count of all search requests. This is computed as the total slice count divided by the total number of concurrent search requests. |
|`search.concurrent_query_total` |The total number of query operations that use concurrent segment search. |
|`search.concurrent_query_time_in_millis` |The total amount of time taken by all query operations that use concurrent segment search, in milliseconds. |
|`search.concurrent_query_current` |The number of currently running query operations that use concurrent segment search. |
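
As an illustrative sketch, these fields appear alongside the existing counters in the `search` statistics object of the response. The values below are hypothetical, and other search counters are omitted:

```json
{
  "_all": {
    "total": {
      "search": {
        "concurrent_avg_slice_count": 3.0,
        "concurrent_query_total": 12,
        "concurrent_query_time_in_millis": 1299,
        "concurrent_query_current": 0
      }
    }
  }
}
```
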
4 changes: 4 additions & 0 deletions _api-reference/nodes-apis/nodes-stats.md
@@ -1140,6 +1140,10 @@ total_rejections_breakup_shadow_mode.throughput_degradation_limits | Integer | T
enabled | Boolean | Specifies whether the shard indexing pressure feature is turned on for the node.
enforced | Boolean | If true, the shard indexing pressure runs in enforced mode (there are rejections). If false, the shard indexing pressure runs in shadow mode (there are no rejections, but statistics are recorded and can be retrieved in the `total_rejections_breakup_shadow_mode` object). Only applicable if shard indexing pressure is enabled.

## Concurrent segment search

Starting in OpenSearch 2.10, [concurrent segment search]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search/) allows each shard-level request to search segments in parallel during the query phase. If you [enable the experimental concurrent segment search feature flag]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search#enabling-the-feature-flag), the Nodes Stats API response will contain several additional fields with statistics about slices (units of work executed by a thread). For the descriptions of those fields, see [Index Stats API]({{site.url}}{{site.baseurl}}/api-reference/index-apis/stats#concurrent-segment-search).
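
As a quick sketch, a request similar to the following returns each node's `indices.search` statistics, where these fields appear:

```json
GET /_nodes/stats/indices/search
```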

## Required permissions

If you use the Security plugin, make sure you have the appropriate permissions: `cluster:monitor/nodes/stats`.
219 changes: 215 additions & 4 deletions _api-reference/profile.md
Expand Up @@ -221,14 +221,14 @@ Field | Data type | Description
`profile.shards` | Array of objects | A search request can be executed against one or more shards in the index, and a search may involve one or more indexes. Thus, the `profile.shards` array contains profiling information for each shard that was involved in the search.
`profile.shards.id` | String | The shard ID of the shard in the `[node-ID][index-name][shard-ID]` format.
`profile.shards.searches` | Array of objects | A search represents a query executed against the underlying Lucene index. Most search requests execute a single search against a Lucene index, but some search requests can execute more than one search. For example, including a global aggregation results in a secondary `match_all` query for the global context. The `profile.shards` array contains profiling information about each search execution.
[`profile.shards.searches.query`](#the-query-object) | Array of objects | Profiling information about the query execution.
[`profile.shards.searches.query`](#the-query-array) | Array of objects | Profiling information about the query execution.
`profile.shards.searches.rewrite_time` | Integer | All Lucene queries are rewritten. A query and its children may be rewritten more than once, until the query stops changing. The rewriting process involves performing optimizations, such as removing redundant clauses or replacing a query path with a more efficient one. After the rewriting process, the original query may change significantly. The `rewrite_time` field contains the cumulative total rewrite time for the query and all its children, in nanoseconds.
[`profile.shards.searches.collector`](#the-collector-array) | Array of objects | Profiling information about the Lucene collectors that ran the search.
[`profile.shards.aggregations`](#aggregations) | Array of objects | Profiling information about the aggregation execution.

### The `query` object
### The `query` array

The `query` object contains the following fields.
The `query` array contains objects with the following fields.

Field | Data type | Description
:--- | :--- | :---
@@ -753,4 +753,215 @@ Field | Description
`post_collection`| Contains the time spent running the aggregation’s `postCollection()` callback method.
`build_aggregation`| Contains the time spent running the aggregation’s `buildAggregations()` method, which builds the results of this aggregation.
`reduce`| Contains the time spent in the `reduce` phase.
`<method>_count` | Contains the number of invocations of a `<method>`. For example, `build_leaf_collector_count` contains the number of invocations of the `build_leaf_collector` method.
`<method>_count` | Contains the number of invocations of a `<method>`. For example, `build_leaf_collector_count` contains the number of invocations of the `build_leaf_collector` method.

## Concurrent segment search

Starting in OpenSearch 2.10, [concurrent segment search]({{site.url}}{{site.baseurl}}/search-plugins/concurrent-segment-search/) allows each shard-level request to search segments in parallel during the query phase. If you enable the experimental concurrent segment search feature flag, the Profile API response will contain several additional fields with statistics about _slices_.

A slice is the unit of work that can be executed by a thread. Each query can be partitioned into multiple slices, with each slice containing one or more segments. The slices can be executed in parallel or sequentially, depending on the number of threads available in the pool.

In general, the max, min, and avg slice times capture statistics across all slices for a given timing type. For example, when profiling aggregations, the `max_slice_time_in_nanos` field in the `aggregations` section shows the maximum time consumed by the aggregation operation and its children across all slices.
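
For context, the sample response in the next section could be produced by a profiled search of roughly the following shape. The index name `idx` and the aggregation name `histo` match the sample response, while the histogram field name `value` and the interval are hypothetical:

```json
GET /idx/_search
{
  "profile": true,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "histo": {
      "histogram": {
        "field": "value",
        "interval": 10
      }
    }
  }
}
```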

#### Example response

The following is an example response for a concurrent search with three segment slices:

<details closed markdown="block">
<summary>
Response
</summary>
{: .text-delta}

```json
{
  "took": 76,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      ...
    ]
  },
  "aggregations": {
    ...
  },
  "profile": {
    "shards": [
      {
        "id": "[Sn2zHhcMTRetEjXvppU8bA][idx][0]",
        "inbound_network_time_in_millis": 0,
        "outbound_network_time_in_millis": 0,
        "searches": [
          {
            "query": [
              {
                "type": "MatchAllDocsQuery",
                "description": "*:*",
                "time_in_nanos": 429246,
                "breakdown": {
                  "set_min_competitive_score_count": 0,
                  "match_count": 0,
                  "shallow_advance_count": 0,
                  "set_min_competitive_score": 0,
                  "next_doc": 5485,
                  "match": 0,
                  "next_doc_count": 5,
                  "score_count": 5,
                  "compute_max_score_count": 0,
                  "compute_max_score": 0,
                  "advance": 3350,
                  "advance_count": 3,
                  "score": 5920,
                  "build_scorer_count": 6,
                  "create_weight": 429246,
                  "shallow_advance": 0,
                  "create_weight_count": 1,
                  "build_scorer": 2221054
                }
              }
            ],
            "rewrite_time": 12442,
            "collector": [
              {
                "name": "QueryCollectorManager",
                "reason": "search_multi",
                "time_in_nanos": 6786930,
                "reduce_time_in_nanos": 5892759,
                "max_slice_time_in_nanos": 5951808,
                "min_slice_time_in_nanos": 5798174,
                "avg_slice_time_in_nanos": 5876588,
                "slice_count": 3,
                "children": [
                  {
                    "name": "SimpleTopDocsCollectorManager",
                    "reason": "search_top_hits",
                    "time_in_nanos": 1340186,
                    "reduce_time_in_nanos": 1084060,
                    "max_slice_time_in_nanos": 457165,
                    "min_slice_time_in_nanos": 433706,
                    "avg_slice_time_in_nanos": 443332,
                    "slice_count": 3
                  },
                  {
                    "name": "NonGlobalAggCollectorManager: [histo]",
                    "reason": "aggregation",
                    "time_in_nanos": 5366791,
                    "reduce_time_in_nanos": 4637260,
                    "max_slice_time_in_nanos": 4526680,
                    "min_slice_time_in_nanos": 4414049,
                    "avg_slice_time_in_nanos": 4487122,
                    "slice_count": 3
                  }
                ]
              }
            ]
          }
        ],
        "aggregations": [
          {
            "type": "NumericHistogramAggregator",
            "description": "histo",
            "time_in_nanos": 16454372,
            "max_slice_time_in_nanos": 7342096,
            "min_slice_time_in_nanos": 4413728,
            "avg_slice_time_in_nanos": 5430066,
            "breakdown": {
              "min_build_leaf_collector": 4320259,
              "build_aggregation_count": 3,
              "post_collection": 9942,
              "max_collect_count": 2,
              "initialize_count": 3,
              "reduce_count": 0,
              "avg_collect": 146319,
              "max_build_aggregation": 2826399,
              "avg_collect_count": 1,
              "max_build_leaf_collector": 4322299,
              "min_build_leaf_collector_count": 1,
              "build_aggregation": 3038635,
              "min_initialize": 1057,
              "max_reduce": 0,
              "build_leaf_collector_count": 3,
              "avg_reduce": 0,
              "min_collect_count": 1,
              "avg_build_leaf_collector_count": 1,
              "avg_build_leaf_collector": 4321197,
              "max_collect": 181266,
              "reduce": 0,
              "avg_build_aggregation": 954896,
              "min_post_collection": 1236,
              "max_initialize": 11603,
              "max_post_collection": 5350,
              "collect_count": 5,
              "avg_post_collection": 2793,
              "avg_initialize": 4860,
              "post_collection_count": 3,
              "build_leaf_collector": 4322299,
              "min_collect": 78519,
              "min_build_aggregation": 8543,
              "initialize": 11971068,
              "max_build_leaf_collector_count": 1,
              "min_reduce": 0,
              "collect": 181838
            },
            "debug": {
              "total_buckets": 1
            }
          }
        ]
      }
    ]
  }
}
```
</details>

### Modified or added response fields

The following sections contain definitions of all modified or added response fields for concurrent segment search.

#### The `query` array

|Field |Description |
|:--- |:--- |
|`time_in_nanos` |For concurrent segment search, `time_in_nanos` is the cumulative amount of time taken to run all methods across all slices, in nanoseconds. This is not equivalent to the actual amount of time the query took to run because it does not take into account that multiple slices can run the methods in parallel. |
|`breakdown.<method>` |For concurrent segment search, this field contains the total amount of time taken by all segments to run a method. |
|`breakdown.<method>_count` |For concurrent segment search, this field contains the total number of invocations of a `<method>` obtained by adding the number of method invocations for all segments. |

#### The `collector` array

|Field |Description |
|:--- |:--- |
|`time_in_nanos` |The total elapsed time for this collector, in nanoseconds. For concurrent segment search, `time_in_nanos` is the total amount of time across all slices (`max(slice_end_time) - min(slice_start_time)`). |
|`max_slice_time_in_nanos` |The maximum amount of time taken by any slice, in nanoseconds. |
|`min_slice_time_in_nanos` |The minimum amount of time taken by any slice, in nanoseconds. |
|`avg_slice_time_in_nanos` |The average amount of time taken by any slice, in nanoseconds. |
|`slice_count` |The total slice count for this query. |
|`reduce_time_in_nanos` |The amount of time taken to reduce results for all slice collectors, in nanoseconds. |
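
To illustrate how `time_in_nanos` relates to the per-slice times: if three slices start at 0, 1,000, and 2,000 nanoseconds and each finishes 5,000 nanoseconds later, then `time_in_nanos` is 7,000 nanoseconds (`7,000 - 0`), while `max_slice_time_in_nanos`, `min_slice_time_in_nanos`, and `avg_slice_time_in_nanos` are each 5,000 nanoseconds. This is why `time_in_nanos` can exceed the time taken by any individual slice.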

#### The `aggregations` array

|Field |Description |
|:--- |:--- |
|`time_in_nanos` |The total elapsed time for this aggregation, in nanoseconds. For concurrent segment search, `time_in_nanos` is the total amount of time across all slices (`max(slice_end_time) - min(slice_start_time)`). |
|`max_slice_time_in_nanos` |The maximum amount of time taken by any slice to run an aggregation, in nanoseconds. |
|`min_slice_time_in_nanos` |The minimum amount of time taken by any slice to run an aggregation, in nanoseconds. |
|`avg_slice_time_in_nanos` |The average amount of time taken by any slice to run an aggregation, in nanoseconds. |
|`<method>` |The total elapsed time across all slices (`max(slice_end_time) - min(slice_start_time)`). For example, for the `collect` method, it is the total time spent collecting documents into buckets across all slices. |
|`max_<method>` |The maximum amount of time taken by any slice to run an aggregation method. |
|`min_<method>`|The minimum amount of time taken by any slice to run an aggregation method. |
|`avg_<method>` |The average amount of time taken by any slice to run an aggregation method. |
|`<method>_count` |The total method count across all slices. For example, for the `collect` method, it is the total number of invocations of this method needed to collect documents into buckets across all slices. |
|`max_<method>_count` |The maximum number of invocations of a `<method>` on any slice. |
|`min_<method>_count` |The minimum number of invocations of a `<method>` on any slice. |
|`avg_<method>_count` |The average number of invocations of a `<method>` on any slice. |