Skip to content

[FEATURE] Support case command pushdown to range agg #4201

@noCharger

Description

@noCharger

Is your feature request related to a problem?
Currently, PPL queries using CASE statements cannot be pushed down to OpenSearch's native aggregation framework. This leads to suboptimal performance as the query needs to:

  1. Fetch all documents from OpenSearch
  2. Process the CASE logic in memory
  3. Then perform aggregations

This is especially inefficient for large datasets where we could leverage OpenSearch's native range aggregations instead.

What solution would you like?
Implement push down optimization for PPL CASE statements, particularly when used with numeric ranges, to convert them into native OpenSearch range aggregations. For example, this PPL query:

source = custom-big5 
| eval range_bucket = case(
    `metrics.size` < -10, 'range_1', 
    `metrics.size` >= -10 and `metrics.size` < 10, 'range_2',
    `metrics.size` >= 10 and `metrics.size` < 100, 'range_3',
    `metrics.size` >= 100 and `metrics.size` < 1000, 'range_4',
    `metrics.size` >= 1000 and `metrics.size` < 2000, 'range_5',
    `metrics.size` >= 2000, 'range_6'
) 
| stats min(`metrics.tmin`) as tmin, avg(`metrics.size`) as tavg, max(`metrics.size`) as tmax 
    by range_bucket, span(`@timestamp`, 1h)

Should be optimized to use OpenSearch's native range aggregation similar to:

{
  "aggs": {
    "range_bucket": {
      "range": {
        "field": "metrics.size",
        "ranges": [
          { "to": -10 },
          { "from": -10, "to": 10 },
          { "from": 10, "to": 100 },
          { "from": 100, "to": 1000 },
          { "from": 1000, "to": 2000 },
          { "from": 2000 }
        ]
      }
    }
  }
}

This optimization would be particularly valuable for:

  • Time series analysis with range bucketing
  • Performance monitoring systems
  • Log analysis with severity levels
  • Any scenario involving numeric range categorization with aggregations

Metadata

Metadata

Assignees

Labels

PPLPiped processing languageenhancementNew feature or requestpushdownpushdown related issues

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions