[RFC] Support `multisearch` command for PPL #4348

## Problem Statement

  Users need a first-class way to execute **multiple independent subsearches** and view the **combined results in strict time order**. Achieving this today requires manual composition (e.g., `append` plus an explicit `sort`) and careful command selection, which is error-prone, harder to optimize, and easy to misuse (e.g., mixing in non-streaming operations that stall pipelines).

  ## Current State

  * PPL provides `append`, which concatenates results **sequentially** in the order subsearches are written.
  * Each `append` command accepts only a **single subsearch**.
  * There is **no built-in interleaving by timestamp**; users must add a `sort` as a separate step.

  ## Short-Term Goal

  **Current multisearch behavior:**
  - Combines results with a **UNION ALL + ORDER BY @timestamp DESC** approach
  - Supports **all PPL commands** in subsearches (no artificial streaming restrictions)
  - Provides **timestamp-based interleaving** when the `@timestamp` field is available
  - Falls back to **sequential concatenation** when the `@timestamp` field is missing
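
  The combination rule above can be modeled outside the engine. The following is a minimal Python sketch, not the actual implementation: the function name `multisearch`, the row representation (a list of dicts), and the choice to fall back whenever any row lacks `@timestamp` are all assumptions made for illustration.

```python
from typing import Any

Row = dict[str, Any]
TIME_FIELD = "@timestamp"  # default ordering field named in this RFC


def multisearch(*subsearch_results: list[Row]) -> list[Row]:
    """Hypothetical model of the result-combination step only."""
    if len(subsearch_results) < 2:
        raise ValueError("multisearch requires at least two subsearches")

    # UNION ALL: keep every row (duplicates included) in written order.
    combined = [row for rows in subsearch_results for row in rows]

    # Assumed fallback rule: if any row lacks the time field,
    # return the plain sequential concatenation.
    if any(TIME_FIELD not in row for row in combined):
        return combined

    # ORDER BY @timestamp DESC. sorted() is stable, so rows with equal
    # timestamps keep the order in which the subsearches were written.
    return sorted(combined, key=lambda row: row[TIME_FIELD], reverse=True)


service = [
    {"@timestamp": "2024-01-01T00:00:03Z", "src": "service"},
    {"@timestamp": "2024-01-01T00:00:01Z", "src": "service"},
]
metrics = [{"@timestamp": "2024-01-01T00:00:02Z", "src": "metrics"}]

for row in multisearch(service, metrics):
    print(row["@timestamp"], row["src"])
```

  The sketch sorts the full combined list, which mirrors the blocking nature of a global `ORDER BY`: no row can be emitted until every subsearch has finished.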

  ## Long-Term Goals

  * Provide a **first-class generating command** that:
    * Runs **two or more** subsearches and **globally interleaves** results by timestamp (DESC).
    * Implements **concurrent subsearch execution** similar to SPL's streaming architecture for improved performance.
    * Provides **real-time result merging** capabilities for live data scenarios.
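
  One possible shape for the long-term design is to run subsearches concurrently and then k-way merge their already-sorted outputs instead of re-sorting everything. The Python sketch below illustrates that idea only; `run_concurrently` and `interleave_desc` are hypothetical names, subsearches are modeled as plain callables, and each subsearch is assumed to return rows already sorted by `@timestamp` in descending order.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterator

Row = dict[str, Any]


def run_concurrently(subsearches: list[Callable[[], list[Row]]]) -> list[list[Row]]:
    """Run each subsearch in its own thread; results keep the written order."""
    if len(subsearches) < 2:
        raise ValueError("multisearch requires at least two subsearches")
    with ThreadPoolExecutor(max_workers=len(subsearches)) as pool:
        futures = [pool.submit(search) for search in subsearches]
        return [future.result() for future in futures]


def interleave_desc(streams: list[list[Row]], time_field: str = "@timestamp") -> Iterator[Row]:
    """Lazily merge streams that are each already sorted newest-first."""
    # heapq.merge with reverse=True expects every input sorted descending
    # and yields rows one at a time without materializing the full result.
    return heapq.merge(*streams, key=lambda row: row[time_field], reverse=True)


def search_a() -> list[Row]:
    return [{"@timestamp": 5, "src": "a"}, {"@timestamp": 2, "src": "a"}]


def search_b() -> list[Row]:
    return [{"@timestamp": 4, "src": "b"}, {"@timestamp": 1, "src": "b"}]


for row in interleave_desc(run_concurrently([search_a, search_b])):
    print(row["@timestamp"], row["src"])
```

  Because the merge is lazy, a downstream `head 5` would only need to pull five rows from the merged iterator. This sketch still waits for each subsearch to finish before merging; true streaming would require the subsearches themselves to yield rows incrementally.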

  ## Current Implementation

  The `multisearch` command is now available with the following syntax:

  ```ppl
  source=<index> | multisearch [ <subsearch> ] [ <subsearch> ] ... [ <subsearch> ]
  ```

  Key semantics:
  - Position: Can be used after a source command (it does not have to be the first command in the pipeline).
  - Cardinality: Requires ≥ 2 subsearches.
  - Command support: All PPL commands are supported in subsearches.
  - Result combination: UNION ALL of all subsearch outputs, followed by ORDER BY `@timestamp` DESC when the field is available.
  - Time field: Uses `@timestamp` by default for ordering results.

  ## Examples

```
  -- Age group analysis with stats (now supported)
  source=accounts | multisearch
      [source=accounts | where age < 30 | eval age_group = 'young']
      [source=accounts | where age >= 30 | eval age_group = 'adult']
  | stats count by age_group
```

```
  -- Time interleaving across two indices
  source=logs | multisearch
      [source=service_logs | where category IN ('A', 'B')]
      [source=metrics_logs | where category IN ('E', 'F')]
  | head 5
```

```
  -- Complex aggregations in subsearches (now supported)
  source=data | multisearch
      [source=sales | stats sum(revenue) by region | eval type = 'sales']
      [source=costs | stats sum(expense) by region | eval type = 'costs']
  | sort region
```

  ## Next Steps for Long-Term Implementation

  - Concurrent Execution Architecture: Investigate SPL-style concurrent subsearch execution using Calcite's parallel query capabilities.
  - Real-time Merging: Explore streaming result merging for live data scenarios where results arrive continuously.
  - Performance Optimization: Benchmark the current UNION ALL + SORT approach against a concurrent execution design to quantify the potential benefit.


