[KYUUBI #3067][DOC] Add Flink Table Store connector doc for Spark SQL Engine

### _Why are the changes needed?_

Add Flink Table Store connector doc for Spark SQL Engine

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #3151 from huage1994/3067.

Closes #3067

b2bc67d [guanhua.lgh] [KYUUBI #3067][DOC] Add Flink Table Store connector doc for Spark SQL Engine

Authored-by: guanhua.lgh <guanhua.lgh@alibaba-inc.com>
Signed-off-by: Kent Yao <yao@apache.org>
huage1994 authored and yaooqinn committed Jul 27, 2022
1 parent 137e818 commit 91a2534
Showing 2 changed files with 92 additions and 1 deletion.
90 changes: 90 additions & 0 deletions docs/connector/spark/flink_table_store.rst
@@ -0,0 +1,90 @@
.. Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
`Flink Table Store`_
====================

Flink Table Store is a unified storage to build dynamic tables for both streaming and batch processing in Flink,
supporting high-speed data ingestion and timely data query.

.. tip::
   This article assumes that you are familiar with the basic concepts and operation of `Flink Table Store`_.
   For anything about Flink Table Store not covered in this article,
   please refer to its `Official Documentation`_.

By using Kyuubi, we can run SQL queries against Flink Table Store in a way that is more
convenient, easier to understand, and easier to extend than manipulating
Flink Table Store with Spark directly.

Flink Table Store Integration
-----------------------------

To integrate the Kyuubi Spark SQL engine with Flink Table Store through the
Apache Spark DataSource V2 and Catalog APIs, you need to:

- Reference the Flink Table Store :ref:`dependencies`
- Set the Spark catalog :ref:`configurations`

.. _dependencies:

Dependencies
************

The **classpath** of the Kyuubi Spark SQL engine with Flink Table Store support consists of

1. kyuubi-spark-sql-engine-|release|.jar, the engine jar deployed with Kyuubi distributions
2. a copy of the Spark distribution
3. flink-table-store-spark-<version>.jar (for example, flink-table-store-spark-0.2.jar), which can be found on `Maven Central`_

To make the Flink Table Store packages visible to the runtime classpath of the engine, use one of these methods:

1. Put the Flink Table Store packages into ``$SPARK_HOME/jars`` directly
2. Set ``spark.jars=/path/to/flink-table-store-spark``, as shown in the sketch below
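
Alternatively, Spark's ``spark.jars.packages`` option can resolve the connector from
Maven at engine startup. The following is only a sketch, not part of the original guide;
the exact artifact coordinate and version should be verified on `Maven Central`_ first:

.. code-block:: properties

   # illustrative coordinate; confirm the artifact name and version on Maven Central
   spark.jars.packages=org.apache.flink:flink-table-store-spark:0.2.0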

.. warning::
   Please mind the compatibility of different Flink Table Store and Spark versions, which can be
   confirmed on the `Flink Table Store multi engine support`_ page.

.. _configurations:

Configurations
**************

To activate the Flink Table Store integration, we can set the following configurations:

.. code-block:: properties

   spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog
   spark.sql.catalog.tablestore.warehouse=file:/tmp/warehouse
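
These properties can be placed in ``$SPARK_HOME/conf/spark-defaults.conf`` or forwarded
to the engine through Kyuubi's configuration. As a quick sanity check (a sketch, assuming
the catalog name ``tablestore`` configured above), switch to the new catalog and list its
tables:

.. code-block:: sql

   -- point the current session at the configured catalog and namespace
   use tablestore.default;
   show tables;
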
Flink Table Store Operations
----------------------------

Flink Table Store supports reading its tables through Spark.
A common scenario is to write data with Flink and read it with Spark.
You can follow the `Flink Table Store Quick Start`_ to write data to a table store table,
and then query that table from the Kyuubi Spark SQL engine with the following ``SELECT`` statement.


.. code-block:: sql

   select * from tablestore.default.word_count;
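
Other read-only Spark SQL statements work against the table in the same way. For
instance (again a sketch, assuming the ``word_count`` table from the quick start
exists):

.. code-block:: sql

   -- count the rows written by Flink; word_count comes from the quick start
   select count(*) from tablestore.default.word_count;
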
.. _Flink Table Store: https://flink.apache.org/
.. _Flink Table Store Quick Start: https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/try-table-store/quick-start/
.. _Official Documentation: https://nightlies.apache.org/flink/flink-table-store-docs-master/
.. _Maven Central: https://mvnrepository.com/artifact/org.apache.flink
.. _Flink Table Store multi engine support: https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/engines/overview/
3 changes: 2 additions & 1 deletion docs/connector/spark/index.rst
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ By default, it provides accessibility to hive warehouses with various file formats
supported, such as parquet, orc, json, etc.

Also, it can easily integrate with other third-party libraries, such as Hudi,
Iceberg, Delta Lake, Kudu, HBase, Cassandra, etc.
Iceberg, Delta Lake, Kudu, Flink Table Store, HBase, Cassandra, etc.

We also provide sample data sources like TPC-DS and TPC-H for testing and benchmarking
purposes.
@@ -36,6 +36,7 @@ purpose.
hudi
iceberg
kudu
flink_table_store
tispark
tpcds
tpch
