[Feature][Connector-V2] Support typesense connector #7450

Merged: 30 commits, Aug 29, 2024
aa26fdb
Merge remote-tracking branch 'upstream/dev' into HEAD
zhangshenghang Aug 14, 2024
f4be624
[feature]typesense source first version submit
zhangshenghang Aug 15, 2024
6204569
[feature]improve typesense source
zhangshenghang Aug 16, 2024
e32c17d
[feature]init typesense sink
zhangshenghang Aug 16, 2024
001a00c
[feature]submit typesense sink
zhangshenghang Aug 16, 2024
d54ec84
[feature]submit typesense sink
zhangshenghang Aug 17, 2024
523f20c
[feature]add typesense e2e
zhangshenghang Aug 19, 2024
a9ea840
[feature]add e2e and header
zhangshenghang Aug 20, 2024
d1ad2bb
[improve]delete useless code
zhangshenghang Aug 20, 2024
c1d4f51
[feature]first version commit
zhangshenghang Aug 21, 2024
98fa2f3
[feature]fix some problem
zhangshenghang Aug 21, 2024
ec7e990
[improve]merge and delete useless code
zhangshenghang Aug 21, 2024
b1601e0
Merge branch 'apache:dev' into feature-support-typesense-connector
zhangshenghang Aug 21, 2024
bd22f00
[feature]improve style
zhangshenghang Aug 21, 2024
63c8852
[feature]fix some problem
zhangshenghang Aug 21, 2024
81f07a9
[feature]fix dead link
zhangshenghang Aug 21, 2024
72f5ca1
[feature]fix some problem
zhangshenghang Aug 21, 2024
600b15e
[feature]fix some problem
zhangshenghang Aug 21, 2024
ee411f8
[feature]fix some problem
zhangshenghang Aug 21, 2024
84124b4
[feature]delete useless code
zhangshenghang Aug 21, 2024
71b711b
[fixbug]fix some problem
zhangshenghang Aug 22, 2024
ec8531b
[fixbug]fix some problem
zhangshenghang Aug 22, 2024
956b04f
Merge branch 'apache:dev' into feature-support-typesense-connector
zhangshenghang Aug 22, 2024
7255437
[fixbug]fix some problem
zhangshenghang Aug 22, 2024
ee863d4
[fixbug]fix some problem
zhangshenghang Aug 22, 2024
a857372
[fixbug]fix some problem
zhangshenghang Aug 22, 2024
99aee47
[feature]fix some problem
zhangshenghang Aug 27, 2024
9b8b34a
Merge branch 'dev' into feature-support-typesense-connector
zhangshenghang Aug 28, 2024
2cd8f2e
Merge remote-tracking branch 'upstream/dev' into feature-support-type…
zhangshenghang Aug 28, 2024
f781588
[feature]add throw exception
zhangshenghang Aug 28, 2024
6 changes: 6 additions & 0 deletions .github/workflows/labeler/label-scope-conf.yml
@@ -257,6 +257,12 @@ activemq:
      - changed-files:
          - any-glob-to-any-file: seatunnel-connectors-v2/connector-activemq/**
          - all-globs-to-all-files: '!seatunnel-connectors-v2/connector-!(activemq)/**'
typesense:
  - all:
      - changed-files:
          - any-glob-to-any-file: seatunnel-connectors-v2/connector-typesense/**
          - all-globs-to-all-files: '!seatunnel-connectors-v2/connector-!(typesense)/**'

Zeta Rest API:
  - changed-files:
      - any-glob-to-any-file: seatunnel-engine/**/server/rest/**
2 changes: 1 addition & 1 deletion config/plugin_config
@@ -88,5 +88,5 @@ connector-web3j
connector-milvus
connector-activemq
connector-sls
connector-typesense
connector-cdc-opengauss
--end--
93 changes: 93 additions & 0 deletions docs/en/connector-v2/sink/Typesense.md
@@ -0,0 +1,93 @@
# Typesense

## Description

Outputs data to `Typesense`.

## Key Features

- [ ] [Exactly Once](../../concept/connector-v2-features.md)
- [x] [CDC](../../concept/connector-v2-features.md)

## Options

| Name | Type | Required | Default Value |
|------------------|--------|----------|------------------------------|
| hosts | array | Yes | - |
| collection | string | Yes | - |
| schema_save_mode | string | Yes | CREATE_SCHEMA_WHEN_NOT_EXIST |
| data_save_mode | string | Yes | APPEND_DATA |
| primary_keys | array | No | |
| key_delimiter | string | No | `_` |
| api_key | string | No | |
| max_retry_count | int | No | 3 |
| max_batch_size | int | No | 10 |
| common-options | | No | - |

### hosts [array]

The access address for Typesense, formatted as `host:port`, e.g., `["typesense-01:8108"]`.
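
As a rough illustration of the `host:port` format, the sketch below splits each entry into a host name and a numeric port. The function name `parse_hosts` and the strict validation are assumptions made for this example; the connector may accept or reject entries differently.

```python
from typing import List, Sequence, Tuple


def parse_hosts(hosts: Sequence[str]) -> List[Tuple[str, int]]:
    """Split each "host:port" entry into a (host, port) pair."""
    parsed = []
    for entry in hosts:
        # rpartition splits on the last colon, so the port is always the tail.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        parsed.append((host, int(port)))
    return parsed


print(parse_hosts(["typesense-01:8108"]))  # [('typesense-01', 8108)]
```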

### collection [string]

The name of the collection to write to, e.g., "seatunnel".

### primary_keys [array]

Primary key fields used to generate the document `id`.

### key_delimiter [string]

Sets the delimiter for composite keys (default is `_`).
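
One way to picture how `primary_keys` and `key_delimiter` could combine into a document `id` is the sketch below. It assumes the values are joined in the order the keys are listed and converted with `str`; the helper name `build_document_id` and the handling of missing fields are hypothetical and not taken from the connector source.

```python
from typing import Any, Mapping, Sequence


def build_document_id(
    row: Mapping[str, Any],
    primary_keys: Sequence[str],
    key_delimiter: str = "_",
) -> str:
    """Join the primary-key values of a row into a single id string."""
    missing = [key for key in primary_keys if key not in row]
    if missing:
        raise KeyError(f"row is missing primary key fields: {missing}")
    # Assumption: values are joined in the order given by primary_keys.
    return key_delimiter.join(str(row[key]) for key in primary_keys)


row = {"id": "124", "num_employees": 5215, "company_name": "Stark Industries"}
print(build_document_id(row, ["num_employees", "id"], "="))  # 5215=124
```

With the default delimiter the same row would yield `5215_124`.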

### api_key [string]

The `api_key` for secure access to Typesense.

### max_retry_count [int]

The maximum number of retry attempts for batch requests.

### max_batch_size [int]

The maximum size of document batches.
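
The interaction between `max_batch_size` and `max_retry_count` might look like the following sketch. It assumes `max_retry_count` counts retries after the first attempt and that a failed batch is resent as a whole; both points, along with the names `chunked`, `send_with_retry`, and the `send` callback, are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(docs: Iterable[dict], max_batch_size: int) -> Iterator[List[dict]]:
    """Group documents into lists of at most max_batch_size items."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def send_with_retry(
    batch: List[dict],
    send: Callable[[List[dict]], None],
    max_retry_count: int = 3,
) -> int:
    """Send one batch, retrying on failure; return the attempts used."""
    # One initial attempt plus up to max_retry_count retries.
    for attempt in range(1, max_retry_count + 2):
        try:
            send(batch)
            return attempt
        except Exception:
            if attempt > max_retry_count:
                raise
    raise AssertionError("unreachable")


sizes = [len(b) for b in chunked(({"id": str(i)} for i in range(25)), 10)]
print(sizes)  # [10, 10, 5]
```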

### common options

Common parameters for Sink plugins. Refer to [Sink Common Options](../sink-common-options.md) for more details.

### schema_save_mode

Choose how to handle the target-side schema before the synchronization task starts:
- `RECREATE_SCHEMA`: Creates the collection if it doesn’t exist; if it already exists, deletes and recreates it.
- `CREATE_SCHEMA_WHEN_NOT_EXIST`: Creates the collection if it doesn’t exist; skips creation if it does.
- `ERROR_WHEN_SCHEMA_NOT_EXIST`: Throws an error if the collection doesn’t exist.
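
The three modes can be read as a small decision table. The sketch below encodes that table under the assumption that the only input is whether the target already exists; the action names `drop` and `create` and the function `plan_schema_actions` are illustrative, not connector APIs.

```python
from enum import Enum
from typing import List


class SchemaSaveMode(Enum):
    RECREATE_SCHEMA = "RECREATE_SCHEMA"
    CREATE_SCHEMA_WHEN_NOT_EXIST = "CREATE_SCHEMA_WHEN_NOT_EXIST"
    ERROR_WHEN_SCHEMA_NOT_EXIST = "ERROR_WHEN_SCHEMA_NOT_EXIST"


def plan_schema_actions(mode: SchemaSaveMode, collection_exists: bool) -> List[str]:
    """Return the schema actions implied by the mode, in order."""
    if mode is SchemaSaveMode.RECREATE_SCHEMA:
        return ["drop", "create"] if collection_exists else ["create"]
    if mode is SchemaSaveMode.CREATE_SCHEMA_WHEN_NOT_EXIST:
        return [] if collection_exists else ["create"]
    if mode is SchemaSaveMode.ERROR_WHEN_SCHEMA_NOT_EXIST:
        if not collection_exists:
            raise RuntimeError("collection does not exist")
        return []
    raise ValueError(f"unsupported mode: {mode}")


mode = SchemaSaveMode("CREATE_SCHEMA_WHEN_NOT_EXIST")
print(plan_schema_actions(mode, collection_exists=False))  # ['create']
```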

### data_save_mode

Choose how to handle existing data on the target side before the synchronization task starts:
- `DROP_DATA`: Keeps the collection schema but deletes its data.
- `APPEND_DATA`: Keeps both the collection schema and the existing data.
- `ERROR_WHEN_DATA_EXISTS`: Throws an error if data already exists.
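
In the same spirit, the data modes reduce to one check on whether the target already holds data. This is a minimal sketch: the function `plan_data_action` and its return values are made up for the example, and the real connector may perform the check differently.

```python
def plan_data_action(data_save_mode: str, has_existing_data: bool) -> str:
    """Return "delete_existing" or "keep_existing" for the given mode."""
    if data_save_mode == "DROP_DATA":
        return "delete_existing" if has_existing_data else "keep_existing"
    if data_save_mode == "APPEND_DATA":
        return "keep_existing"
    if data_save_mode == "ERROR_WHEN_DATA_EXISTS":
        if has_existing_data:
            raise RuntimeError("target already contains data")
        return "keep_existing"
    raise ValueError(f"unsupported data_save_mode: {data_save_mode}")


print(plan_data_action("APPEND_DATA", has_existing_data=True))  # keep_existing
```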

## Example

Simple example:

```hocon
sink {
  Typesense {
    source_table_name = "typesense_test_table"
    hosts = ["localhost:8108"]
    collection = "typesense_to_typesense_sink_with_query"
    max_retry_count = 3
    max_batch_size = 10
    api_key = "xyz"
    primary_keys = ["num_employees","id"]
    key_delimiter = "="
    schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
    data_save_mode = "APPEND_DATA"
  }
}
```

79 changes: 79 additions & 0 deletions docs/en/connector-v2/source/Typesense.md
@@ -0,0 +1,79 @@
# Typesense

> Typesense Source Connector

## Description

Reads data from Typesense.

## Key Features

- [x] [Batch Processing](../../concept/connector-v2-features.md)
- [ ] [Stream Processing](../../concept/connector-v2-features.md)
- [ ] [Exactly-Once](../../concept/connector-v2-features.md)
- [x] [Schema](../../concept/connector-v2-features.md)
- [x] [Parallelism](../../concept/connector-v2-features.md)
- [ ] [User-Defined Splits Support](../../concept/connector-v2-features.md)

## Options

| Name | Type | Required | Default |
|------------|--------|----------|---------|
| hosts | array | yes | - |
| collection | string | yes | - |
| schema | config | yes | - |
| api_key | string | no | - |
| query | string | no | - |
| batch_size | int | no | 100 |

### hosts [array]

The access address of Typesense, for example: `["typesense-01:8108"]`.

### collection [string]

The name of the collection to read from, for example: `"seatunnel"`.

### schema [config]

The columns to be read from Typesense. For more information, please refer to the [guide](../../concept/schema-feature.md#how-to-declare-type-supported).

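### api_key [string]

The `api_key` for Typesense security authentication.

### query [string]

The search parameters that select which documents to read, written as a query string, for example: `q=*&filter_by=num_employees:>9000`.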

### batch_size [int]

The number of records to query per batch when reading data.
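
Reading in batches can be pictured as page-by-page fetching until a short or empty page comes back. The sketch below assumes a hypothetical `fetch_page(page, per_page)` callback standing in for a Typesense search request that uses the `page` and `per_page` parameters; how the connector actually pages through results is not shown in this document.

```python
from typing import Callable, Iterator, List


def read_all(
    fetch_page: Callable[[int, int], List[dict]],
    batch_size: int = 100,
) -> Iterator[dict]:
    """Yield every document, requesting batch_size records per call."""
    page = 1
    while True:
        hits = fetch_page(page, batch_size)
        if not hits:
            break
        yield from hits
        # A short page means there is nothing left to fetch.
        if len(hits) < batch_size:
            break
        page += 1


# In-memory stand-in for a collection of 250 documents.
docs = [{"id": str(i)} for i in range(250)]


def fake_fetch(page: int, per_page: int) -> List[dict]:
    start = (page - 1) * per_page
    return docs[start:start + per_page]


print(len(list(read_all(fake_fetch, batch_size=100))))  # 250
```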

### Common Options

For common parameters of Source plugins, please refer to [Source Common Options](../source-common-options.md).

## Example

```hocon
source {
  Typesense {
    hosts = ["localhost:8108"]
    collection = "companies"
    api_key = "xyz"
    query = "q=*&filter_by=num_employees:>9000"
    schema = {
      fields {
        company_name_list = array<string>
        company_name = string
        num_employees = long
        country = string
        id = string
        c_row = {
          c_int = int
          c_string = string
          c_array_int = array<int>
        }
      }
    }
  }
}
```
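
The `query` value in the example above has the shape of a URL query string. As a sketch of how such a string breaks down into individual search parameters, the snippet below uses Python's standard `urllib.parse`; whether the connector parses the option this way is an assumption, and a filter value containing `&` would need escaping under this approach.

```python
from typing import Dict
from urllib.parse import parse_qsl


def parse_query(query: str) -> Dict[str, str]:
    """Split a "k1=v1&k2=v2" string into a parameter dictionary."""
    return dict(parse_qsl(query, keep_blank_values=True))


print(parse_query("q=*&filter_by=num_employees:>9000"))
# {'q': '*', 'filter_by': 'num_employees:>9000'}
```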

95 changes: 95 additions & 0 deletions docs/zh/connector-v2/sink/Typesense.md
@@ -0,0 +1,95 @@
# Typesense

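## Description

Outputs data to `Typesense`.

## Key Features

- [ ] [Exactly Once](../../concept/connector-v2-features.md)
- [x] [CDC](../../concept/connector-v2-features.md)

## Options

| Name | Type | Required | Default Value |
|------------------|--------|----------|------------------------------|
| hosts | array | Yes | - |
| collection | string | Yes | - |
| schema_save_mode | string | Yes | CREATE_SCHEMA_WHEN_NOT_EXIST |
| data_save_mode | string | Yes | APPEND_DATA |
| primary_keys | array | No | |
| key_delimiter | string | No | `_` |
| api_key | string | No | |
| max_retry_count | int | No | 3 |
| max_batch_size | int | No | 10 |
| common-options | | No | - |

### hosts [array]

The access address of Typesense, formatted as `host:port`, for example: `["typesense-01:8108"]`.

### collection [string]

The name of the collection to write to, for example: "seatunnel".

### primary_keys [array]

Primary key fields used to generate the document `id`.

### key_delimiter [string]

Sets the delimiter for composite keys (default is `_`).

### api_key [string]

The `api_key` for Typesense security authentication.

### max_retry_count [int]

The maximum number of retry attempts for batch requests.

### max_batch_size [int]

The maximum number of documents per batch.

### common options

Common parameters for Sink plugins. Refer to [Sink Common Options](../sink-common-options.md) for details.

### schema_save_mode

Choose how to handle the existing schema on the target side before the synchronization task starts.<br/>
Options:<br/>
`RECREATE_SCHEMA`: Creates the collection if it does not exist; deletes and recreates it if it already exists.<br/>
`CREATE_SCHEMA_WHEN_NOT_EXIST`: Creates the collection if it does not exist; skips creation if it already exists.<br/>
`ERROR_WHEN_SCHEMA_NOT_EXIST`: Throws an error if the collection does not exist.<br/>

### data_save_mode

Choose how to handle existing data on the target side before the synchronization task starts.<br/>
Options:<br/>
`DROP_DATA`: Keeps the collection schema and deletes the data.<br/>
`APPEND_DATA`: Keeps the collection schema and keeps the data.<br/>
`ERROR_WHEN_DATA_EXISTS`: Throws an error if data already exists.<br/>

## Example

Simple example: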

```hocon
sink {
  Typesense {
    source_table_name = "typesense_test_table"
    hosts = ["localhost:8108"]
    collection = "typesense_to_typesense_sink_with_query"
    max_retry_count = 3
    max_batch_size = 10
    api_key = "xyz"
    primary_keys = ["num_employees","id"]
    key_delimiter = "="
    schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
    data_save_mode = "APPEND_DATA"
  }
}
```

79 changes: 79 additions & 0 deletions docs/zh/connector-v2/source/Typesense.md
@@ -0,0 +1,79 @@
# Typesense

> Typesense Source Connector

## Description

Reads data from Typesense.

## Key Features

- [x] [Batch Processing](../../concept/connector-v2-features.md)
- [ ] [Stream Processing](../../concept/connector-v2-features.md)
- [ ] [Exactly-Once](../../concept/connector-v2-features.md)
- [x] [Schema](../../concept/connector-v2-features.md)
- [x] [Parallelism](../../concept/connector-v2-features.md)
- [ ] [User-Defined Splits Support](../../concept/connector-v2-features.md)

## Options

| Name | Type | Required | Default |
|------------|--------|----------|---------|
| hosts | array | Yes | - |
| collection | string | Yes | - |
| schema | config | Yes | - |
| api_key | string | No | - |
| query | string | No | - |
| batch_size | int | No | 100 |

### hosts [array]

The access address of Typesense, formatted as `host:port`, for example: `["typesense-01:8108"]`.

### collection [string]

The name of the collection to read from, for example: "seatunnel".

### schema [config]

The columns to read from Typesense. For more information, see the [guide](../../concept/schema-feature.md#how-to-declare-type-supported).

### api_key [string]

The `api_key` for Typesense security authentication.

### batch_size [int]

The number of records to query per batch when reading data.

### Common Options

For common parameters of Source plugins, refer to [Source Common Options](../source-common-options.md).

## Example

```hocon
source {
  Typesense {
    hosts = ["localhost:8108"]
    collection = "companies"
    api_key = "xyz"
    query = "q=*&filter_by=num_employees:>9000"
    schema = {
      fields {
        company_name_list = array<string>
        company_name = string
        num_employees = long
        country = string
        id = string
        c_row = {
          c_int = int
          c_string = string
          c_array_int = array<int>
        }
      }
    }
  }
}
```

3 changes: 2 additions & 1 deletion plugin-mapping.properties
@@ -132,8 +132,9 @@ seatunnel.source.Milvus = connector-milvus
seatunnel.sink.Milvus = connector-milvus
seatunnel.sink.ActiveMQ = connector-activemq
seatunnel.source.Sls = connector-sls
seatunnel.source.Typesense = connector-typesense
seatunnel.sink.Typesense = connector-typesense
seatunnel.source.Opengauss-CDC = connector-cdc-opengauss

seatunnel.transform.Sql = seatunnel-transforms-v2
seatunnel.transform.FieldMapper = seatunnel-transforms-v2
seatunnel.transform.Filter = seatunnel-transforms-v2
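
Each line in this file follows the pattern `seatunnel.<kind>.<Name> = <artifact>`. The sketch below parses that pattern into a lookup table, which is one way to check that both the Typesense source and sink resolve to `connector-typesense`. The function `parse_plugin_mapping` is hypothetical; how SeaTunnel itself loads this file is not shown here.

```python
from typing import Dict, Tuple


def parse_plugin_mapping(text: str) -> Dict[Tuple[str, str], str]:
    """Map (kind, plugin name) pairs to the artifact that provides them."""
    mapping: Dict[Tuple[str, str], str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, artifact = line.partition("=")
        parts = key.strip().split(".")
        # Expect exactly: seatunnel.<kind>.<Name>
        if len(parts) != 3 or parts[0] != "seatunnel":
            continue
        mapping[(parts[1], parts[2])] = artifact.strip()
    return mapping


sample = """
seatunnel.source.Typesense = connector-typesense
seatunnel.sink.Typesense = connector-typesense
seatunnel.source.Opengauss-CDC = connector-cdc-opengauss
"""
print(parse_plugin_mapping(sample)[("sink", "Typesense")])  # connector-typesense
```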