[Doc] Add docs for paimon format #2159

Merged 2 commits on Oct 23, 2023
5 changes: 5 additions & 0 deletions docs/_index.md
@@ -42,6 +42,11 @@ Amoro meets diverse user needs by using different table formats. Currently, Amor
Iceberg format tables use the engine integration method provided by the Iceberg community.
For details, please refer to: [Iceberg Docs](https://iceberg.apache.org/docs/latest/).

### Paimon format

Paimon format tables use the engine integration method provided by the Paimon community.
For details, please refer to: [Paimon Docs](https://paimon.apache.org/docs/master/).

### Mixed format

Amoro supports multiple processing engines for Mixed format as below:
6 changes: 6 additions & 0 deletions docs/engines/flink/flink-get-started.md
@@ -16,6 +16,12 @@ The Iceberg Format can be accessed using the Connector provided by Iceberg.
Refer to the documentation at [Iceberg Flink user manual](https://iceberg.apache.org/docs/latest/flink-connector/)
for more information.

## Paimon format

The Paimon Format can be accessed using the Connector provided by Paimon.
Refer to the documentation at [Paimon Flink user manual](https://paimon.apache.org/docs/master/engines/flink/)
for more information.
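
For example, a minimal Flink SQL sketch might register a Paimon catalog and query a table through it. The warehouse path, database, and table names below are illustrative, and the paimon-flink bundle jar is assumed to be on the Flink classpath:

```sql
-- Register a Paimon catalog in the Flink SQL client
CREATE CATALOG paimon_catalog WITH (
    'type' = 'paimon',
    'warehouse' = 'hdfs:///path/to/paimon/warehouse'
);

USE CATALOG paimon_catalog;

-- Query an existing Paimon table
SELECT * FROM my_db.my_table;
```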

## Mixed format
The Apache Flink engine can process Amoro table data in both batch and streaming mode. The Flink on Amoro connector provides the ability to read from and write to the Amoro data lake while ensuring data consistency. To meet businesses' demands for highly real-time data, the Amoro data lake's underlying storage structure is designed with a LogStore, which stores the latest changelog or append-only real-time data.

6 changes: 6 additions & 0 deletions docs/engines/spark/spark-get-started.md
@@ -15,6 +15,12 @@ The Iceberg Format can be accessed using the Connector provided by Iceberg.
Refer to the documentation at [Iceberg Spark Connector](https://iceberg.apache.org/docs/latest/getting-started/)
for more information.

# Paimon Format

The Paimon Format can be accessed using the Connector provided by Paimon.
Refer to the documentation at [Paimon Spark Connector](https://paimon.apache.org/docs/master/engines/spark3/)
for more information.
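
For example, assuming `spark-sql` is started with the Paimon Spark catalog configured, e.g. `--conf spark.sql.catalog.paimon=org.apache.paimon.spark.SparkCatalog` and `--conf spark.sql.catalog.paimon.warehouse=hdfs:///path/to/paimon/warehouse`, a minimal sketch might look like the following (the catalog, database, and table names are illustrative):

```sql
-- Create a database and an append-only table under the configured Paimon catalog
CREATE DATABASE IF NOT EXISTS paimon.my_db;

CREATE TABLE IF NOT EXISTS paimon.my_db.my_table (
    id   BIGINT,
    name STRING
);

-- Write and read back a few rows
INSERT INTO paimon.my_db.my_table VALUES (1, 'a'), (2, 'b');
SELECT * FROM paimon.my_db.my_table;
```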

# Mixed Format


6 changes: 5 additions & 1 deletion docs/engines/trino.md
@@ -12,7 +12,11 @@ menu:

## Iceberg format
Iceberg format can be accessed using the Iceberg Connector provided by Trino.
please refer to the documentation at [Iceberg Connector](https://trino.io/docs/current/connector/iceberg.html#) for more information.
Please refer to the documentation at [Iceberg Trino user manual](https://trino.io/docs/current/connector/iceberg.html#) for more information.

## Paimon format
Paimon format can be accessed using the Trino connector provided by Paimon.
Please refer to the documentation at [Paimon Trino user manual](https://paimon.apache.org/docs/master/engines/trino/) for more information.
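
For example, assuming a Trino catalog named `paimon` has been configured with the Paimon Trino connector, a minimal sketch of browsing and querying a Paimon table might look like the following (schema and table names are illustrative):

```sql
-- Browse the schemas and tables exposed through the Paimon catalog
SHOW SCHEMAS FROM paimon;
SHOW TABLES FROM paimon.my_db;

-- Read a Paimon table through Trino
SELECT * FROM paimon.my_db.my_table LIMIT 10;
```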

## Mixed format
### Install
3 changes: 2 additions & 1 deletion docs/formats/overview.md
@@ -27,4 +27,5 @@ Currently, Amoro mainly provides the following table formats:

- **Iceberg format:** Users can directly entrust their Iceberg tables to Amoro for maintenance, so that users can not only use all the functions of Iceberg tables, but also enjoy the performance and stability improvements brought by Amoro.
- **Mixed-Iceberg format:** Amoro provides a set of more optimized formats for streaming update scenarios on top of the Iceberg format. If users have high performance requirements for streaming updates or have demands for CDC incremental data reading functions, they can choose to use the Mixed-Iceberg format.
- **Mixed-Hive format:** Many users do not want to affect the business originally built on Hive while using data lakes. Therefore, Amoro provides the Mixed-Hive format, which can upgrade Hive tables to the Mixed-Hive format through metadata migration alone, and the original Hive tables can still be used normally. This ensures business stability while benefiting from the advantages of data lake computing.
- **Paimon format:** Amoro supports displaying metadata for Paimon format tables, including Schema, Options, Files, Snapshots, DDLs, and Compaction information.
19 changes: 19 additions & 0 deletions docs/formats/paimon.md
@@ -0,0 +1,19 @@
---
title: "Paimon"
url: paimon-format
aliases:
- "formats/paimon"
menu:
main:
parent: Formats
weight: 200
---
# Paimon Format

Paimon format refers to [Apache Paimon](https://paimon.apache.org/) tables.
Paimon is a streaming data lake platform with high-speed data ingestion, changelog tracking and efficient real-time analytics.

By registering Paimon's catalog with Amoro, users can view Paimon table information such as Schema, Options, Files, Snapshots, DDLs, and Compaction details.
Furthermore, they can operate on Paimon tables using Spark SQL in the Terminal. All catalog types and file system types currently supported by Paimon are supported.
For the steps to register a catalog, please refer to [Managing Catalogs](../managing-catalogs/).
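
For example, after a Paimon catalog has been registered, a minimal Spark SQL sketch in the Terminal might look like the following (database and table names are illustrative):

```sql
-- List databases and tables in the registered Paimon catalog
SHOW DATABASES;
SHOW TABLES IN my_db;

-- Inspect the schema and data of a Paimon table
DESCRIBE TABLE my_db.my_table;
SELECT * FROM my_db.my_table LIMIT 10;
```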