Description
Is your feature request related to a problem?
In the current implementation, we perform direct updates on AggregationBuilder when pushing down sort/limit operations into aggregations. However, AggregationBuilder doesn't support deep copying, which means a single instance is shared across different equivalent plans. Consequently, updates to the AggregationBuilder in one plan propagate to all equivalent plans.
Fortunately, these update operations are idempotent, so applying the same update several times to a single AggregationBuilder doesn't cause issues. When different update operations of the same type are applied, the later operation overrides the earlier one made by another equivalent plan, and the final plan still ends up correct.
However, in certain cases the operator that issues an update can be removed from some equivalent plans. For example, a sort operator before an aggregation or dedup doesn't affect the collation of the final results, so the planner may eliminate it in some equivalent plans while retaining it in others. When retained, it updates the shared AggregationBuilder for all plans, including those where the sort was eliminated. While this doesn't cause semantic correctness issues (since both preserving and removing collation are acceptable), it can lead to confusing inconsistencies between the PushDownContext and the AggregationBuilder in the final plan.
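The inconsistency described above can be reproduced in miniature. The sketch below is not the project's code: `FakeAggBuilder` and `FakePushDownContext` are hypothetical stand-ins, and the only assumption carried over from this issue is that copying a context shares the builder instance instead of deep copying it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a mutable AggregationBuilder that cannot be deep copied.
final class FakeAggBuilder {
    private String order = "none";

    void order(String order) { this.order = order; }

    String order() { return order; }
}

// Hypothetical stand-in for PushDownContext: a per-plan list of pushed
// operations plus a reference to the builder.
final class FakePushDownContext {
    final List<String> pushedOps;
    final FakeAggBuilder builder;

    FakePushDownContext(List<String> pushedOps, FakeAggBuilder builder) {
        this.pushedOps = pushedOps;
        this.builder = builder;
    }

    // The operation list is copied, but the builder instance is shared.
    FakePushDownContext copy() {
        return new FakePushDownContext(new ArrayList<>(pushedOps), builder);
    }

    // Eager push-down: records the operation and mutates the builder in place.
    void pushDownSort(String order) {
        pushedOps.add("SORT " + order);
        builder.order(order);
    }
}

final class SharedBuilderDemo {
    public static void main(String[] args) {
        FakePushDownContext base =
            new FakePushDownContext(new ArrayList<>(), new FakeAggBuilder());
        FakePushDownContext planWithSort = base.copy();
        FakePushDownContext planWithoutSort = base.copy();

        // Only the plan that retained the sort pushes it down.
        planWithSort.pushDownSort("key asc");

        // The other plan's context records no sort, yet its builder carries one.
        System.out.println(planWithoutSort.pushedOps);       // prints []
        System.out.println(planWithoutSort.builder.order()); // prints key asc
    }
}
```

Here the plan without the sort ends up with an empty operation list but a builder ordered by `key asc`, which is the same kind of mismatch reported between PushDownContext and AggregationBuilder.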
What solution would you like?
We should implement lazy updates for AggregationBuilder, similar to our approach with OpenSearchRequestBuilder. Currently, this is blocked by our implementation of pushing down operators on aggregations, where we don't detect failures before updating the AggregationBuilder and instead rely on the exception-throwing mechanism during the update process.
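One possible shape for such lazy updates is sketched below, purely as an illustration. `SketchAggBuilder`, `LazyPushDownContext`, and the limit validation rule are all hypothetical and are not taken from OpenSearchRequestBuilder or the existing push-down code. The idea is that each plan keeps its own list of pending updates, validity is checked before anything is recorded, and a fresh builder is materialized only when the final plan is built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical stand-in for AggregationBuilder.
final class SketchAggBuilder {
    String order = "none";
    int size = -1;
}

// Hypothetical context that records updates instead of applying them eagerly.
final class LazyPushDownContext {
    private final Supplier<SketchAggBuilder> baseFactory;
    private final List<Consumer<SketchAggBuilder>> pending;

    LazyPushDownContext(Supplier<SketchAggBuilder> baseFactory) {
        this(baseFactory, new ArrayList<>());
    }

    private LazyPushDownContext(
            Supplier<SketchAggBuilder> baseFactory,
            List<Consumer<SketchAggBuilder>> pending) {
        this.baseFactory = baseFactory;
        this.pending = pending;
    }

    // Each equivalent plan gets its own pending list; no builder is shared.
    LazyPushDownContext copy() {
        return new LazyPushDownContext(baseFactory, new ArrayList<>(pending));
    }

    void pushDownSort(String order) {
        pending.add(builder -> builder.order = order);
    }

    // Failure is detected up front (assumed rule: limit must be positive)
    // instead of relying on an exception thrown while mutating the builder.
    boolean pushDownLimit(int limit) {
        if (limit <= 0) {
            return false;
        }
        pending.add(builder -> builder.size = limit);
        return true;
    }

    // Materialize a fresh builder by replaying this plan's updates in order.
    SketchAggBuilder build() {
        SketchAggBuilder builder = baseFactory.get();
        for (Consumer<SketchAggBuilder> update : pending) {
            update.accept(builder);
        }
        return builder;
    }
}

final class LazyUpdateDemo {
    public static void main(String[] args) {
        LazyPushDownContext base = new LazyPushDownContext(SketchAggBuilder::new);
        LazyPushDownContext planWithSort = base.copy();
        LazyPushDownContext planWithoutSort = base.copy();

        planWithSort.pushDownSort("key asc");

        System.out.println(planWithSort.build().order);    // prints key asc
        System.out.println(planWithoutSort.build().order); // prints none
    }
}
```

Because validation happens before an update is recorded, a rejected push-down leaves no trace in any plan, which is the precondition this issue identifies as currently missing.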
Do you have any additional context?
Some bad cases spotted:
https://github.com/opensearch-project/sql/pull/4993/files/2a88ae61e0c3348961378d1b2e5d51fa0ac60238..58012e9f425beffc1fc0fa6a81e4d44ded805abb#r2647765328