Open
Labels
PPL (Piped processing language), bug (Something isn't working), calcite (Calcite migration related)
Description
What is the bug?
The reverse operation fails when applied to datasets larger than 10,000 rows.
When Calcite fallback is enabled:
- Datasets between 12,000 and 26,000 rows consistently fail with timeouts.
- Datasets above 26,000 rows trigger a fielddata circuit breaker due to excessive memory usage on the _id field.
When Calcite fallback is disabled:
- All datasets larger than 10,000 rows, including those above 26,000, fail with timeouts, not circuit breaker exceptions.
- The maximum row count that consistently succeeds under both modes is 10,000.
How can one reproduce the bug?
Steps to reproduce the behavior:
- Use a source with a large number of rows (e.g., source=big5)
- Apply a head operation with a row limit greater than 10,000
- Apply the reverse operation
- Execute the query
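The steps above can be sketched as a small Python helper that sends the query to the PPL REST endpoint (`POST /_plugins/_ppl`). This is a minimal sketch, not the tool used for the runs in this report: the host `http://localhost:9200`, the absence of authentication and TLS, and the helper names `build_ppl_request` and `run_query` are all assumptions for illustration.

```python
import json
import urllib.request


def build_ppl_request(limit, host="http://localhost:9200"):
    """Build a POST request for `source=big5 | head <limit> | reverse`.

    The host is an assumption (local, unauthenticated cluster); adjust it
    and add credentials for a secured cluster.
    """
    query = f"source=big5 | head {limit} | reverse"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/_plugins/_ppl",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(limit, timeout_s=180):
    """Send the query and return the parsed JSON response.

    Raises urllib.error.URLError / HTTPError on failure, which is where a
    failing row count would surface.
    """
    request = build_ppl_request(limit)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Calling `run_query(10000)` and then `run_query(12000)` against a cluster that holds the big5 index should exercise the succeeding and failing cases described in this report.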
Query: source=big5 | head 10000 | reverse
Iterations: 3, Timeout: 180s
Run 1: 993ms
Run 2: 766ms
Run 3: 766ms
Percentiles: P90=993ms | P95=993ms
Query: source=big5 | head 12000 | reverse
Run 1: FAILED (3982ms)
Run 2: FAILED (3933ms)
Run 3: FAILED (3944ms)
Query: source=big5 | head 30000 | reverse
Run 1: FAILED (14189ms)
Run 2: FAILED (3876ms)
Run 3: FAILED (3912ms)
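The percentile figures in the first run can be checked with a short calculation. The benchmark tool and its percentile definition are not stated in this report; the sketch below assumes the nearest-rank definition, which is consistent with the reported values. With only three samples, P90 and P95 both collapse to the slowest run.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples <= it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Timings (ms) from the successful `head 10000 | reverse` runs above.
runs_ms = [993, 766, 766]
p90 = nearest_rank_percentile(runs_ms, 90)
p95 = nearest_rank_percentile(runs_ms, 95)
print(f"P90={p90}ms | P95={p95}ms")
```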
What is the expected behavior?
When Calcite fallback is enabled, the operation fails with a CircuitBreakingException once the fielddata for _id exceeds the configured breaker limit. When Calcite fallback is disabled, the operation fails with a timeout error.
CircuitBreakingException[[fielddata] Data too large, data for [_id] would be [12027099143/11.2gb], which is larger than the limit of [12025908428/11.1gb]]
Full error stack trace:
[2025-07-25T22:33:28,014][WARN ][o.o.i.b.fielddata ] [ip-172-31-35-163] [fielddata] New used memory 12027099143 [11.2gb] for data of [_id] would be larger than configured breaker: 12025908428 [11.1gb], breaking
[2025-07-25T22:33:28,015][ERROR][o.o.s.p.r.RestPPLQueryAction] [ip-172-31-35-163] Error happened during query handling
java.lang.RuntimeException: java.sql.SQLException: exception while executing query: all shards failed
at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.lambda$execute$6(OpenSearchExecutionEngine.java:203) ~[?:?]
at java.base/java.security.AccessController.doPrivileged(AccessController.java:319) ~[?:?]
...
Caused by: org.opensearch.core.common.breaker.CircuitBreakingException: [fielddata] Data too large, data for [_id] would be [12027099143/11.2gb], which is larger than the limit of [12025908428/11.1gb]
at org.opensearch.common.breaker.ChildMemoryCircuitBreaker.circuitBreak(ChildMemoryCircuitBreaker.java:104) ~[opensearch-3.0.0.jar:3.0.0]
...
What is your host/environment?
- OpenSearch Version: 3.0.0
- Plugins: SQL, on Calcite Engine
- Java Version: Java 17+
- AWS EC2 instance, Ubuntu
- Memory Configuration: Circuit breaker limit configured at 11.1GB for fielddata
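The log prints both byte counts and rounded sizes, and the rounded sizes (11.2gb vs 11.1gb) can make the overshoot look larger than it is. A quick calculation on the byte counts from the exception message, assuming the "gb" unit in the log means GiB (2^30 bytes), shows how far past the limit the request actually was:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Byte counts copied from the CircuitBreakingException message.
requested_bytes = 12_027_099_143
limit_bytes = 12_025_908_428

overshoot_bytes = requested_bytes - limit_bytes

print(f"requested: {requested_bytes / GIB:.4f} GiB")
print(f"limit:     {limit_bytes / GIB:.4f} GiB")
print(f"overshoot: {overshoot_bytes} bytes ({overshoot_bytes / MIB:.2f} MiB)")
```

Under that assumption the request exceeded the limit by roughly 1.1 MiB, so the breaker tripped right at the threshold, not 0.1 GB beyond it.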
Do you have any additional context?
- The issue appears to be specifically related to the _id field's memory usage during the reverse operation
- The threshold between success and failure is between 10,000 and 12,000 rows
- The circuit breaker is triggered very consistently at the same memory threshold (11.2GB vs 11.1GB limit)
- This may affect other operations that need to sort large datasets in reverse order
Status: In progress
