[FEATURE] Export PPL-Calcite engine as reusable library #3734

**Is your feature request related to a problem?**

Currently, the Calcite-based PPL engine lives within the OpenSearch SQL/PPL plugin and cannot be reused outside of that context. This limits its applicability in other environments, where the same parsing and logical planning capabilities would be valuable.

Use cases:

1. https://github.com/opensearch-project/opensearch-spark/issues/1136
2. https://github.com/opensearch-project/sql/issues/627
3. PPL CLI

**What solution would you like?**

### Granularity: Modular Publishing vs. Fat JAR

We considered two approaches for packaging and publishing the Calcite-based PPL engine for external use:

- **Option A: Modular Publishing**  
  - **Description:** Publish each internal module (`ppl`, `core`, `opensearch`, etc.) as an independent Maven artifact.  
  - **Pros:** Enables better reuse and flexibility—consumers (e.g., Spark) can depend only on the components they need.  
  - **Cons:** Requires publishing all relevant internal modules.

- **Option B: Fat JAR**  
  - **Description:** Bundle all internal modules and dependencies into a single artifact.  
  - **Pros:** Simplifies consumption (e.g., by the PPL CLI) with a single plug-and-play artifact.  
  - **Cons:** Tightly couples all components, increases artifact size, and reduces modularity.

<img width="1314" alt="Image" src="https://github.com/user-attachments/assets/87b01231-b5ae-4d0e-8b13-2d786dc7a794" />

### The New API Module

We propose introducing a new `api` module that serves as a high-level integration layer for the Calcite-based engine and as the primary entry point for external consumers.

- **Expose a unified API**: The current Calcite engine is tightly coupled with OpenSearch internals (e.g., `DataSourceService`, `Settings`), making reuse difficult. The new module will provide a clean, reusable interface for external consumers without exposing low-level implementation details.

- **Guarantee interoperability**: Integration tests will validate the contract between the Calcite engine and external consumers (e.g., Spark, CLI), ensuring correctness and long-term compatibility.
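
To make the shape of such an entry point more concrete, here is a minimal, self-contained sketch. It is not the actual `api` module or the PoC's `UnifiedQueryPlanner`: every name in it (`UnifiedApiSketch`, `SchemaProvider`, `QueryPlanner`, `LogicalPlan`) is hypothetical, the parser is a toy that only understands `source=<table> | fields a, b`, and the plan type stands in for what a real implementation would presumably return as a Calcite `RelNode`. The only point it illustrates is that the planner depends on a small pluggable schema interface rather than on OpenSearch internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch only; none of these types exist in the sql repository. */
public class UnifiedApiSketch {

  /** Pluggable schema: maps a table name to its columns (could be backed by an OpenSearch or Spark catalog). */
  public interface SchemaProvider {
    Optional<List<String>> columnsOf(String table);
  }

  /** Stand-in for a logical plan; a real engine would return a Calcite RelNode instead. */
  public static final class LogicalPlan {
    public final String table;
    public final List<String> fields;

    public LogicalPlan(String table, List<String> fields) {
      this.table = table;
      this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    @Override
    public String toString() {
      return "Project" + fields + " <- Scan[" + table + "]";
    }
  }

  /** Entry point for consumers: needs only a SchemaProvider, no plugin services or settings. */
  public static final class QueryPlanner {
    private final SchemaProvider schema;

    public QueryPlanner(SchemaProvider schema) {
      this.schema = schema;
    }

    /** Toy planner: accepts "source=<table>" optionally followed by "| fields a, b". */
    public LogicalPlan plan(String query) {
      String[] stages = query.split("\\|");
      String source = stages[0].trim();
      if (!source.startsWith("source=")) {
        throw new IllegalArgumentException("query must start with source=<table>");
      }
      String table = source.substring("source=".length()).trim();
      List<String> columns = schema.columnsOf(table)
          .orElseThrow(() -> new IllegalArgumentException("unknown table: " + table));
      if (stages.length == 1) {
        return new LogicalPlan(table, columns); // no projection: keep every column
      }
      String stage = stages[1].trim();
      if (stages.length > 2 || !stage.startsWith("fields ")) {
        throw new IllegalArgumentException("only a single 'fields' stage is supported in this sketch");
      }
      List<String> fields = new ArrayList<>();
      for (String raw : stage.substring("fields ".length()).split(",")) {
        String name = raw.trim();
        if (!columns.contains(name)) {
          throw new IllegalArgumentException("unknown field: " + name);
        }
        fields.add(name);
      }
      return new LogicalPlan(table, fields);
    }
  }

  public static void main(String[] args) {
    // An in-memory catalog plays the role of the pluggable schema.
    SchemaProvider catalog = table -> {
      if ("logs".equals(table)) {
        return Optional.of(Arrays.asList("host", "status", "bytes"));
      }
      return Optional.empty();
    };
    QueryPlanner planner = new QueryPlanner(catalog);
    System.out.println(planner.plan("source=logs | fields host, status"));
  }
}
```

A consumer such as Spark or the CLI would supply its own `SchemaProvider` and decide how to execute the returned plan. Assertions against a planner like this are the kind of contract the integration tests are meant to pin down.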

Tasks:

- [x] https://github.com/opensearch-project/sql/pull/3763
    - [x] **Add a Gradle task** to build a shaded (fat) JAR that bundles all required transitive dependencies for easy external consumption.
    - [x] **Publish the packaged library** to a Maven repository so it can be consumed by Spark or other tools.
- [x] https://github.com/opensearch-project/sql/pull/3783
    - [x] **Create a new submodule** that acts as a thin abstraction layer over existing Calcite engine internals to isolate and simplify external usage.
    - [x] **Define a unified interface** for interacting with the Calcite engine, with pluggable schema (OpenSearch or Spark catalog) and runtime (local or SparkSQL) support.
    - [x] **Add integration tests** for basic use cases to serve as a contract between the Calcite engine and downstream consumers such as Spark or the CLI.
- [x] https://github.com/opensearch-project/sql/pull/4723
    - [x] **Backport to 2.x branch** once the Calcite engine backport to 2.x is complete.

**What alternatives have you considered?**

- Using the full SQL/PPL plugin artifact directly: One option is to consume the existing OpenSearch SQL/PPL plugin artifact as-is. However, this approach brings in many unrelated dependencies—such as REST handlers, transport layers, and plugin wiring—which significantly increase the artifact’s size and complexity.

**Do you have any additional context?**

- Current Calcite code: https://github.com/opensearch-project/sql/blob/main/core/src/main/java/org/opensearch/sql/executor/QueryService.java#L99C17-L108C29
- Unified API proof of concept (PoC):
    - API: https://github.com/dai-chen/sql-1/blob/export-ppl-calcite-library-without-opensearch/unified-engine-api/src/main/java/org/opensearch/sql/unified/api/UnifiedQueryPlanner.java
    - Example usage: https://github.com/dai-chen/sql-1/blob/export-ppl-calcite-library-without-opensearch/integ-test/src/test/java/org/opensearch/sql/calcite/standalone/CalciteUnifiedEngineIT.java
