[KYUUBI #3071][DOC] Add iceberg connector for Flink SQL Engine

apache · Jul 22, 2022 · bf9d158 · bf9d158
1 parent f1312ea
commit bf9d158

Showing 2 changed files with 123 additions and 0 deletions.
diff --git a/docs/connector/flink/iceberg.rst b/docs/connector/flink/iceberg.rst
@@ -0,0 +1,121 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+
+`Iceberg`_
+==========
+
+Apache Iceberg is an open table format for huge analytic datasets.
+Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and Impala
+using a high-performance table format that works just like a SQL table.
+
+.. tip::
+   This article assumes that you are familiar with the basic concepts and operations of `Iceberg`_.
+   For information about Iceberg that is not covered in this article,
+   refer to its `Official Documentation`_.
+
+By using Kyuubi, we can run SQL queries against Iceberg, which is more
+convenient, easier to understand, and easier to extend than using
+Flink directly to manipulate Iceberg.
+
+Iceberg Integration
+-------------------
+
+To enable the integration of the Kyuubi Flink SQL engine and Iceberg through Catalog APIs, you need to:
+
+- Reference the Iceberg :ref:`dependencies`
+
+.. _dependencies:
+
+Dependencies
+************
+
+The **classpath** of the Kyuubi Flink SQL engine with Iceberg support consists of
+
+1. kyuubi-flink-sql-engine-|release|.jar, the engine jar deployed with Kyuubi distributions
+2. a copy of the Flink distribution
+3. iceberg-flink-runtime-<flink.version>-<iceberg.version>.jar (example: iceberg-flink-runtime-1.14-0.14.0.jar), which can be found in `Maven Central`_
+
+To make the Iceberg packages visible on the runtime classpath of the engine, we can use one of these methods:
+
+1. Put the Iceberg packages into ``$FLINK_HOME/lib`` directly
+2. Set ``pipeline.jars=/path/to/iceberg-flink-runtime``
+
+.. warning::
+   Pay attention to the compatibility between Iceberg and Flink versions, which can be checked on the `Iceberg multi engine support`_ page.
+
+Iceberg Operations
+------------------
+
+Taking ``CREATE CATALOG`` as an example:
+
+.. code-block:: sql
+
+   CREATE CATALOG hive_catalog WITH (
+     'type'='iceberg',
+     'catalog-type'='hive',
+     'uri'='thrift://localhost:9083',
+     'warehouse'='hdfs://nn:8020/warehouse/path'
+   );
+   USE CATALOG hive_catalog;
+
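+A Hadoop-backed catalog can be declared in a similar way. This is only a minimal sketch:
+the catalog name ``hadoop_catalog`` is hypothetical and the warehouse path is a placeholder.
+
+.. code-block:: sql
+
+   -- Hypothetical catalog name and placeholder warehouse path
+   CREATE CATALOG hadoop_catalog WITH (
+     'type'='iceberg',
+     'catalog-type'='hadoop',
+     'warehouse'='hdfs://nn:8020/warehouse/path'
+   );
+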
+Taking ``CREATE DATABASE`` as an example:
+
+.. code-block:: sql
+
+   CREATE DATABASE iceberg_db;
+   USE iceberg_db;
+
+Taking ``CREATE TABLE`` as an example:
+
+.. code-block:: sql
+
+   CREATE TABLE `hive_catalog`.`default`.`sample` (
+     id BIGINT COMMENT 'unique id',
+     data STRING
+   );
+
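+A partitioned variant can be sketched as follows. The table name ``sample_partitioned`` is
+hypothetical and used only for illustration. A statement that writes to a specific partition,
+such as ``INSERT OVERWRITE ... PARTITION(data='a')``, expects a table partitioned by ``data``.
+
+.. code-block:: sql
+
+   -- Hypothetical table name, for illustration only
+   CREATE TABLE `hive_catalog`.`default`.`sample_partitioned` (
+     id BIGINT COMMENT 'unique id',
+     data STRING
+   ) PARTITIONED BY (data);
+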
+Taking ``Batch Read`` as an example:
+
+.. code-block:: sql
+
+   SET execution.runtime-mode = batch;
+   SELECT * FROM sample;
+
+Taking ``Streaming Read`` as an example:
+
+.. code-block:: sql
+
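+   SET execution.runtime-mode = streaming;
+   -- Allow the OPTIONS hint below; dynamic table options are disabled by default on some Flink versions
+   SET table.dynamic-table-options.enabled = true;
+   SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s')*/ ;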
+
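+A streaming read can also start from a given snapshot instead of the current table state.
+The following is a sketch only: the snapshot id is a placeholder and must be replaced with
+an id that exists in your table's history.
+
+.. code-block:: sql
+
+   -- Read incremental data after the given snapshot; the id below is a placeholder
+   SELECT * FROM sample /*+ OPTIONS('streaming'='true', 'monitor-interval'='1s', 'start-snapshot-id'='3821550127947089987')*/ ;
+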
+Taking ``INSERT INTO`` as an example:
+
+.. code-block:: sql
+
+   INSERT INTO `hive_catalog`.`default`.`sample` VALUES (1, 'a');
+   INSERT INTO `hive_catalog`.`default`.`sample` SELECT id, data FROM other_kafka_table;
+
+Taking ``INSERT OVERWRITE`` as an example. Note that Flink streaming jobs do not
+support ``INSERT OVERWRITE``, so it must run as a batch job:
+
+.. code-block:: sql
+
+   INSERT OVERWRITE `hive_catalog`.`default`.`sample` VALUES (1, 'a');
+   INSERT OVERWRITE `hive_catalog`.`default`.`sample` PARTITION(data='a') SELECT 6;
+
+.. _Iceberg: https://iceberg.apache.org/
+.. _Official Documentation: https://iceberg.apache.org/docs/latest/
+.. _Maven Central: https://mvnrepository.com/artifact/org.apache.iceberg
+.. _Iceberg multi engine support: https://iceberg.apache.org/multi-engine-support/
diff --git a/docs/connector/flink/index.rst b/docs/connector/flink/index.rst
@@ -18,3 +18,5 @@ Connectors For Flink SQL Query Engine
 
 .. toctree::
     :maxdepth: 2
+
+    iceberg