Description
What is the bug?
When no cursor is requested (fetch_size is not specified), the SQL Legacy engine creates Point-in-Time (PIT) contexts unnecessarily. The PIT is also not closed properly, so PIT contexts leak and accumulate until they expire after the keep-alive timeout (default: 1 minute). Under repeated or concurrent query workloads, the leaked contexts can reach the default limit of 300 open PIT contexts, resulting in query failures.
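As a rough illustration of how quickly the limit can be reached, the arithmetic below estimates the number of leaked PIT contexts at steady state. It is a back-of-the-envelope sketch that assumes each query leaks exactly one PIT and that every leaked PIT lives for the full keep-alive; the function names are hypothetical and not part of the plugin.

```python
# Back-of-the-envelope estimate of leaked PIT contexts.
# Assumptions (not confirmed by the plugin code): one leaked PIT per query,
# and each leaked PIT stays open for the full keep-alive.

KEEP_ALIVE_SECONDS = 60   # default keep-alive from the report (1 minute)
PIT_LIMIT = 300           # default limit of open PIT contexts from the report


def steady_state_open_pits(queries_per_second, keep_alive_seconds):
    """Open PITs once creations and expirations balance (rate x lifetime)."""
    return queries_per_second * keep_alive_seconds


def min_rate_to_reach_limit(pit_limit, keep_alive_seconds):
    """Sustained query rate at which the leaked PITs reach the limit."""
    return pit_limit / keep_alive_seconds


print(steady_state_open_pits(5, KEEP_ALIVE_SECONDS))        # 300
print(min_rate_to_reach_limit(PIT_LIMIT, KEEP_ALIVE_SECONDS))  # 5.0
```

Under these assumptions, a sustained rate of about 5 queries per second is enough to keep 300 leaked contexts open.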
How can one reproduce the bug?
- Create test index:
PUT /test-logs-2025-01-01
{
"mappings": {
"properties": {
"action": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
"timestamp": {"type": "date"}
}
}
}
- Insert test data:
POST /test-logs-2025-01-01/_bulk
{"index":{}}
{"action":"login_success","timestamp":"2025-01-01T10:00:00Z"}
{"index":{}}
{"action":"login_success","timestamp":"2025-01-01T10:01:00Z"}
{"index":{}}
{"action":"login_failed","timestamp":"2025-01-01T10:02:00Z"}
- Run SQL query (with backticks for V2 test, e.g., `test-logs-2025-01-01`):
POST /_plugins/_sql
{
"query": "SELECT * FROM test-logs-2025-01-01 WHERE action LIKE 'login%' ORDER BY timestamp ASC"
}
- Check PIT contexts:
GET /_nodes/stats/indices/search?filter_path=nodes.*.indices.search.point_in_time_current
What is the expected behavior?
For queries that complete in a single page (no cursor requested / no follow-up page needed):
- Do not create PIT, or;
- If PIT is created internally, close/delete it before returning the response.
Note: The SQL/PPL V2 engine may also create PIT implicitly, but in that case it can be intentional: V2 may fetch until the last page and then close the PIT to ensure correctness for any required in-memory post-processing.
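The second option above (close the PIT before returning) can be sketched as a scope that always deletes the PIT, even when the search fails. This is a minimal illustration only: `FakePitClient`, `pit_scope`, and `run_single_page_query` are hypothetical names built around an in-memory stand-in, not the plugin's actual classes or the OpenSearch client API.

```python
from contextlib import contextmanager


class FakePitClient:
    """Hypothetical in-memory stand-in for a search client (illustration only)."""

    def __init__(self):
        self.open_pits = set()
        self._next_id = 0

    def create_pit(self, index, keep_alive="1m"):
        self._next_id += 1
        pit_id = f"pit-{self._next_id}"
        self.open_pits.add(pit_id)
        return pit_id

    def delete_pit(self, pit_id):
        self.open_pits.discard(pit_id)

    def search(self, pit_id, query):
        if query is None:
            raise ValueError("empty query")
        return {"pit_id": pit_id, "hits": [query]}


@contextmanager
def pit_scope(client, index):
    """Create a PIT and guarantee it is deleted when the block exits."""
    pit_id = client.create_pit(index)
    try:
        yield pit_id
    finally:
        # Runs on success and on error, so the PIT cannot leak.
        client.delete_pit(pit_id)


def run_single_page_query(client, index, query):
    """Single-page query: the PIT is closed before the response is returned."""
    with pit_scope(client, index) as pit_id:
        return client.search(pit_id, query)


client = FakePitClient()
response = run_single_page_query(client, "test-logs-2025-01-01", "login%")
print(len(client.open_pits))  # 0
```

The `finally` clause is the point of the sketch: cleanup is tied to leaving the scope, not to the query succeeding.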
What is your host/environment?
- OpenSearch Version: 2.19-dev, main branch
- Plugin: SQL plugin
Do you have any screenshots?
N/A
Do you have any additional context?
- Legacy engine (SQL): always creates PIT aggressively when handling queries. The conditional check from 2.5 was removed.
- V2 Engine (SQL/PPL): PIT is created when from + size > maxResultWindow. The request size is Integer.MAX_VALUE by default, and the size limit setting is not applied. The V2 engine closes PIT contexts by fetching all pages until an empty response is received and then deleting the PIT, so no PIT leaks.
- 2.5 code: https://github.com/opensearch-project/sql/blob/2.5/opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java#L99
- main code: https://github.com/opensearch-project/sql/blob/main/opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java#L63
- V3 Engine (Calcite - PPL only): doesn't have the issue because the Calcite optimizer automatically applies a system size limit to every query. The system limit is read from a setting whose default is 10k.
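The V2 condition described above (from + size > maxResultWindow) can be checked with a few lines of arithmetic. The sketch below is illustrative only: `needs_pit` is a hypothetical helper, the 10,000 value is OpenSearch's default for index.max_result_window, and treating the V3 system limit as the request size in the same comparison is an assumption rather than something confirmed by the code.

```python
INT_MAX = 2**31 - 1                  # Java's Integer.MAX_VALUE
DEFAULT_MAX_RESULT_WINDOW = 10_000   # OpenSearch default for index.max_result_window
V3_SYSTEM_LIMIT = 10_000             # 10k default system limit mentioned above


def needs_pit(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Mirror of the reported condition: from + size > maxResultWindow."""
    return from_ + size > max_result_window


# Default request size (Integer.MAX_VALUE) exceeds the window, so a PIT is created.
print(needs_pit(0, INT_MAX))          # True

# With a 10k size limit applied (assumed V3-style), the window is not exceeded.
print(needs_pit(0, V3_SYSTEM_LIMIT))  # False
```

This shows why an unbounded default size triggers PIT creation for every query, while a bounded size at or below the result window does not.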