[FEATURE] Support profile API for PPL commands #4294

@noCharger

Description

Is your feature request related to a problem?

PPL queries currently lack the detailed performance profiling that OpenSearch's native Profile API provides. PPL offers basic metrics collection through the stats API and query plan visualization through the explain endpoint, but it has no equivalent to native query profiling, which provides timing breakdowns, shard-level execution details, and performance bottleneck identification.

This makes it difficult for users to:

  • Identify performance bottlenecks in complex PPL queries with multiple piped commands
  • Understand timing breakdown for each PPL command in the pipeline (source, where, fields, stats, eval, etc.)
  • Optimize PPL queries based on detailed execution metrics similar to OpenSearch's native search profiling
  • Debug slow-performing PPL queries effectively with the same level of detail available for native OpenSearch queries
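
For reference, the level of detail in question looks roughly like this in native search: profiling is switched on with `"profile": true` in the search body, and the response carries a per-shard tree of query nodes, each with a `time_in_nanos` value. Below is a minimal Python sketch that flattens such a tree. The index name, field names, and every timing value in `sample_response` are made up for illustration, and `walk_query_profile` and `flatten_profile` are hypothetical helper names, not part of any client library.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Native search request body with profiling switched on.
search_body: Dict[str, Any] = {
    "profile": True,
    "query": {
        "bool": {
            "must": [
                {"term": {"message": "error"}},
                {"term": {"status": 500}},
            ]
        }
    },
}


def walk_query_profile(
    nodes: List[Dict[str, Any]], depth: int = 0
) -> Iterator[Tuple[int, str, int]]:
    """Yield (depth, type, time_in_nanos) for every node in a profiled query tree."""
    for node in nodes:
        yield depth, node["type"], node["time_in_nanos"]
        # Child queries are nested under "children" when present.
        yield from walk_query_profile(node.get("children", []), depth + 1)


# Hand-written response fragment; ids, descriptions and timings are illustrative only.
sample_response: Dict[str, Any] = {
    "profile": {
        "shards": [
            {
                "id": "[node-1][logs][0]",
                "searches": [
                    {
                        "query": [
                            {
                                "type": "BooleanQuery",
                                "description": "+message:error +status:500",
                                "time_in_nanos": 900,
                                "children": [
                                    {"type": "TermQuery", "description": "message:error", "time_in_nanos": 600},
                                    {"type": "TermQuery", "description": "status:500", "time_in_nanos": 200},
                                ],
                            }
                        ]
                    }
                ],
            }
        ]
    }
}


def flatten_profile(response: Dict[str, Any]) -> List[Tuple[str, int, str, int]]:
    """Return one (shard_id, depth, query_type, time_in_nanos) row per profiled node."""
    rows = []
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for depth, query_type, nanos in walk_query_profile(search["query"]):
                rows.append((shard["id"], depth, query_type, nanos))
    return rows
```

A PPL user has no comparable per-shard, per-node view of where time went inside the pipeline.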

What solution would you like?

Implement a profile API for PPL commands similar to OpenSearch's native Profile API. This should include:

  • PPL Profile Endpoint: A new REST endpoint (e.g., /_plugins/_ppl/_profile) that accepts PPL queries and returns detailed profiling information
  • Command-level Timing: Detailed timing breakdown for each PPL command in the pipeline, showing time_in_nanos and breakdown statistics for operations like filtering, aggregation, and data transformation
  • Shard-level Details: Per-shard profiling information showing how PPL commands are executed across different shards, including network timing and search execution details
  • Aggregation Profiling: When PPL queries use stats or other aggregation commands, provide the same detailed aggregation profiling available in OpenSearch's native API
  • Concurrent Segment Search Support: Include slice-level timing statistics (max/min/avg slice times) when concurrent segment search is enabled
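
To make the proposal concrete, here is a rough Python sketch of what a client interaction could look like. Everything in it is an assumption: the `/_plugins/_ppl/_profile` path is only the example suggested above and does not exist today, the request body is assumed to mirror the `{"query": ...}` body of the existing `/_plugins/_ppl` endpoint, and the response layout (`profile.commands[]` with `command` and `time_in_nanos`) is a hypothetical shape with made-up timings, not a settled design.

```python
import json
from typing import Any, Dict

# Hypothetical endpoint from this proposal; not implemented anywhere yet.
PROFILE_ENDPOINT = "/_plugins/_ppl/_profile"


def build_profile_request(ppl_query: str) -> Dict[str, Any]:
    """Describe the HTTP request a client would send (assumed body shape)."""
    return {
        "method": "POST",
        "path": PROFILE_ENDPOINT,
        "body": json.dumps({"query": ppl_query}),
    }


# Hypothetical response: one entry per piped command, timings are invented.
example_profile: Dict[str, Any] = {
    "profile": {
        "commands": [
            {"command": "source", "time_in_nanos": 1_200_000},
            {"command": "where", "time_in_nanos": 3_400_000},
            {"command": "stats", "time_in_nanos": 8_900_000},
            {"command": "fields", "time_in_nanos": 150_000},
        ]
    }
}


def total_time_in_nanos(profile: Dict[str, Any]) -> int:
    """Sum the per-command timings."""
    return sum(c["time_in_nanos"] for c in profile["profile"]["commands"])


def slowest_command(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return the command entry with the largest time_in_nanos."""
    return max(profile["profile"]["commands"], key=lambda c: c["time_in_nanos"])


request = build_profile_request(
    "source=logs | where status = 500 | stats count() by host | fields host"
)
bottleneck = slowest_command(example_profile)
```

With per-command entries like these, finding the bottleneck in a pipeline becomes a one-line lookup; shard-level and slice-level details could nest under each entry in the same way the native API nests them under each shard.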

What alternatives have you considered?

  • Enhanced Stats API: Extending the existing PPL stats endpoint to include more detailed timing information, but this would mix operational metrics with query-specific profiling
  • Enhanced Explain API: Adding timing information to the explain endpoint, but this would blur the distinction between query planning and execution profiling
  • Client-side Timing: Using the existing timing utilities in tests (PPLIntegTestCase.java), but this doesn't provide the granular, server-side profiling needed for production debugging
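
The limitation of client-side timing can be seen in a minimal Python sketch. The `time_call` helper below is hypothetical and is not the utility from `PPLIntegTestCase.java`; it wraps any callable that runs a query and reports a single wall-clock duration.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def time_call(run_query: Callable[[], T]) -> Tuple[T, int]:
    """Run the callable and return (result, elapsed nanoseconds on the client)."""
    start = time.perf_counter_ns()
    result = run_query()
    elapsed = time.perf_counter_ns() - start
    return result, elapsed


# Stand-in for a real PPL call; any zero-argument callable works.
result, elapsed_nanos = time_call(lambda: sum(range(1000)))
```

The one number this yields includes network latency and serialization, and it cannot be attributed to individual commands or shards, which is why it does not substitute for server-side profiling.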

Do you have any additional context?

  • The current explain functionality shows logical and physical query plans but doesn't provide execution timing
  • OpenSearch's Profile API is a resource-consuming operation that adds overhead to search operations, so the same performance considerations should apply to PPL profiling
  • This feature would be particularly valuable for debugging complex PPL queries with multiple transformations, aggregations, and joins

Metadata

Labels

PPL (Piped processing language), feature

Status

Not Started