Merge pull request #1307 from taosdata/other/TD-12525
tdenginewrite rebuild
TrafalgarLuo authored May 25, 2022
2 parents 53b2288 + 1a7a00c commit 9168f4c
Showing 66 changed files with 4,257 additions and 1,541 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -62,6 +62,7 @@ DataX now has a fairly comprehensive plugin ecosystem; mainstream RDBMS databases, N
| | Elasticsearch | ||[Write](https://github.com/alibaba/DataX/blob/master/elasticsearchwriter/doc/elasticsearchwriter.md)|
| Time-series database | OpenTSDB || |[Read](https://github.com/alibaba/DataX/blob/master/opentsdbreader/doc/opentsdbreader.md)|
| | TSDB |||[Read](https://github.com/alibaba/DataX/blob/master/tsdbreader/doc/tsdbreader.md), [Write](https://github.com/alibaba/DataX/blob/master/tsdbwriter/doc/tsdbhttpwriter.md)|
| | TDengine |||[Read](https://github.com/taosdata/DataX/blob/master/tdenginereader/doc/tdenginereader.md), [Write](https://github.com/taosdata/DataX/blob/master/tdenginewriter/doc/tdenginewriter-CN.md)|

# Alibaba Cloud DataWorks Data Integration

7 changes: 7 additions & 0 deletions package.xml
@@ -180,6 +180,13 @@
            </includes>
            <outputDirectory>datax</outputDirectory>
        </fileSet>
        <fileSet>
            <directory>tdenginereader/target/datax/</directory>
            <includes>
                <include>**/*.*</include>
            </includes>
            <outputDirectory>datax</outputDirectory>
        </fileSet>

        <!-- writer -->
        <fileSet>
1 change: 1 addition & 0 deletions pom.xml
@@ -108,6 +108,7 @@
        <module>hbase20xsqlreader</module>
        <module>hbase20xsqlwriter</module>
        <module>kuduwriter</module>
        <module>tdenginereader</module>
    </modules>

    <dependencyManagement>
195 changes: 195 additions & 0 deletions tdenginereader/doc/tdenginereader-CN.md
@@ -0,0 +1,195 @@
# DataX TDengineReader

## 1 Quick Introduction

The TDengineReader plugin reads data from TDengine.

## 2 Implementation

TDengineReader retrieves data by running queries through the TDengine JDBC driver.
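
The read path described above can be sketched with plain `java.sql` calls. This is only an illustration, not the plugin's actual code: the class name `JdbcReadSketch` and the method `readAll` are hypothetical, and it assumes the TDengine JDBC driver (the `taos-jdbcdriver` artifact declared in this module's `pom.xml`) is on the classpath so that `DriverManager` can resolve the URL. A real reader would most likely stream rows to the DataX channel rather than collect them in memory.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative and not taken from the plugin source.
public class JdbcReadSketch {

    /** Runs one query and returns every row as a list of column values. */
    static List<List<Object>> readAll(String jdbcUrl, String user, String password, String sql)
            throws SQLException {
        List<List<Object>> rows = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order.
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                List<Object> row = new ArrayList<>(columnCount);
                for (int i = 1; i <= columnCount; i++) {
                    row.add(rs.getObject(i)); // JDBC column indexes are 1-based
                }
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.out.println("usage: JdbcReadSketch <jdbcUrl> <user> <password> <sql>");
            return;
        }
        for (List<Object> row : readAll(args[0], args[1], args[2], args[3])) {
            System.out.println(row);
        }
    }
}
```

With the values from the sample configuration in section 3.1, the arguments would be the `jdbcUrl` entry, `root`, `taosdata`, and a query such as `select * from test.meters`.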

## 3 Features

### 3.1 Sample Configuration

* A job that extracts data from TDengine:

```json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "tdenginereader",
          "parameter": {
            "username": "root",
            "password": "taosdata",
            "connection": [
              {
                "table": [
                  "meters"
                ],
                "jdbcUrl": [
                  "jdbc:TAOS-RS://192.168.56.105:6041/test?timestampFormat=TIMESTAMP"
                ]
              }
            ],
            "column": [
              "ts",
              "current",
              "voltage",
              "phase"
            ],
            "where": "ts>=0",
            "beginDateTime": "2017-07-14 10:40:00",
            "endDateTime": "2017-08-14 10:40:00"
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
```

* A job that extracts data with a custom SQL statement:

```json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "tdenginereader",
          "parameter": {
            "username": "root",
            "password": "taosdata",
            "connection": [
              {
                "querySql": [
                  "select * from test.meters"
                ],
                "jdbcUrl": [
                  "jdbc:TAOS-RS://192.168.56.105:6041/test?timestampFormat=TIMESTAMP"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
```

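### 3.2 Parameters

* **username**
  * Description: User name for the TDengine instance. <br />
  * Required: yes <br />
  * Default: none <br />
* **password**
  * Description: Password for the TDengine instance. <br />
  * Required: yes <br />
  * Default: none <br />
* **jdbcUrl**
  * Description: JDBC connection information for the TDengine database. Note that jdbcUrl must be placed inside the connection block. For the URL format, see the official TDengine documentation.
  * Required: yes <br />
  * Default: none <br />
* **querySql**
  * Description: In some scenarios the where option is not expressive enough to describe the filter, and this option lets the user supply a custom query instead. When querySql is configured, TDengineReader ignores table, column, where, beginDateTime, and endDateTime, and uses this statement directly to select the data. For example, to synchronize the result of a multi-table join, use select a,b from table_a join table_b on table_a.id = table_b.id <br />
  * Required: no <br />
  * Default: none <br />
* **table**
  * Description: The tables to synchronize, given as a JSON array, so several tables can be extracted in one job. When several tables are configured, the user must make sure they share the same schema; TDengineReader does not check whether they form the same logical table. Note that table must be placed inside the connection block. <br />
  * Required: yes, unless querySql is configured <br />
  * Default: none <br />
* **where**
  * Description: The where clause of the filter. TDengineReader builds a SQL statement from the configured column, table, where, beginDateTime, and endDateTime, and extracts data with that statement. <br />
  * Required: no <br />
  * Default: none <br />
* **beginDateTime**
  * Description: Start time of the data. The job migrates the data between beginDateTime and endDateTime. Format: yyyy-MM-dd HH:mm:ss <br />
  * Required: no <br />
  * Default: none <br />
* **endDateTime**
  * Description: End time of the data. The job migrates the data between beginDateTime and endDateTime. Format: yyyy-MM-dd HH:mm:ss <br />
  * Required: no <br />
  * Default: none <br />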
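
How these options could combine into one statement can be sketched as follows. This is a hypothetical illustration, not the plugin's implementation: the class `QuerySketch`, the method `buildQuery`, the `tsColumn` parameter, the half-open time range (`>=` begin, `<` end), and the parentheses around the where clause are all assumptions made for the example. The only behavior taken from the descriptions above is that querySql overrides the other options and that date-times use the `yyyy-MM-dd HH:mm:ss` format.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and the exact SQL shape are assumptions, not plugin code.
public class QuerySketch {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String buildQuery(String querySql, List<String> columns, String table,
                             String where, String begin, String end, String tsColumn) {
        // A configured querySql wins; every other option is ignored.
        if (querySql != null && !querySql.trim().isEmpty()) {
            return querySql;
        }
        List<String> conditions = new ArrayList<>();
        if (begin != null) {
            LocalDateTime.parse(begin, FORMAT); // throws DateTimeParseException on a bad format
            conditions.add(tsColumn + " >= '" + begin + "'");
        }
        if (end != null) {
            LocalDateTime.parse(end, FORMAT);
            conditions.add(tsColumn + " < '" + end + "'");
        }
        if (where != null && !where.trim().isEmpty()) {
            conditions.add("(" + where + ")");
        }
        StringBuilder sql = new StringBuilder("select ")
                .append(String.join(",", columns))
                .append(" from ").append(table);
        if (!conditions.isEmpty()) {
            sql.append(" where ").append(String.join(" and ", conditions));
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Values taken from the first sample configuration in section 3.1.
        System.out.println(buildQuery(null,
                Arrays.asList("ts", "current", "voltage", "phase"), "meters",
                "ts>=0", "2017-07-14 10:40:00", "2017-08-14 10:40:00", "ts"));
    }
}
```

Validating the date-times before they reach the SQL string means a malformed value fails fast with a parse error instead of producing a query the server rejects.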

### 3.3 Type Conversion

| TDengine data type | DataX internal type |
| --------------- | ------------- |
| TINYINT | Long |
| SMALLINT | Long |
| INTEGER | Long |
| BIGINT | Long |
| FLOAT | Double |
| DOUBLE | Double |
| BOOLEAN | Bool |
| TIMESTAMP | Date |
| BINARY | Bytes |
| NCHAR | String |
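
The table above can also be expressed as a lookup. The sketch below only mirrors the table rows. The class `TypeMappingSketch`, the enum `DataXType`, and the method `toDataXType` are hypothetical names for illustration and do not come from the DataX or TDengine APIs.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: mirrors the conversion table, not the plugin's own code.
public class TypeMappingSketch {

    enum DataXType { LONG, DOUBLE, BOOL, DATE, BYTES, STRING }

    private static final Map<String, DataXType> MAPPING = new LinkedHashMap<>();

    static {
        MAPPING.put("TINYINT", DataXType.LONG);
        MAPPING.put("SMALLINT", DataXType.LONG);
        MAPPING.put("INTEGER", DataXType.LONG);
        MAPPING.put("BIGINT", DataXType.LONG);
        MAPPING.put("FLOAT", DataXType.DOUBLE);
        MAPPING.put("DOUBLE", DataXType.DOUBLE);
        MAPPING.put("BOOLEAN", DataXType.BOOL);
        MAPPING.put("TIMESTAMP", DataXType.DATE);
        MAPPING.put("BINARY", DataXType.BYTES);
        MAPPING.put("NCHAR", DataXType.STRING);
    }

    /** Looks up the DataX internal type for a TDengine type name, ignoring case. */
    static DataXType toDataXType(String tdengineType) {
        DataXType type = MAPPING.get(tdengineType.trim().toUpperCase(Locale.ROOT));
        if (type == null) {
            throw new IllegalArgumentException("Unsupported TDengine type: " + tdengineType);
        }
        return type;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, DataXType> entry : MAPPING.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Rejecting unknown type names explicitly, rather than falling back to a default, keeps an unsupported column from being silently mis-typed.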

## 4 Performance Report

### 4.1 Environment

#### 4.1.1 Data Characteristics

#### 4.1.2 Machine Specifications

#### 4.1.3 DataX JVM Parameters

-Xms1024m -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError

### 4.2 Test Report

#### 4.2.1 Single-Table Test Report

| Channels | DataX speed (Rec/s) | DataX throughput (MB/s) | DataX host NIC outbound traffic (MB/s) | DataX host load | DB NIC inbound traffic (MB/s) | DB load | DB TPS |
|--------| --------|--------|--------|--------|--------|--------|--------|
|1| | | | | | | |
|4| | | | | | | |
|8| | | | | | | |
|16| | | | | | | |
|32| | | | | | | |

Notes:

#### 4.2.2 Performance Test Summary

1.
2.

## 5 Constraints and Limitations

## FAQ
135 changes: 135 additions & 0 deletions tdenginereader/pom.xml
@@ -0,0 +1,135 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>datax-all</artifactId>
        <groupId>com.alibaba.datax</groupId>
        <version>0.0.1-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>tdenginereader</artifactId>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.alibaba.datax</groupId>
            <artifactId>datax-common</artifactId>
            <version>${datax-project-version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>slf4j-log4j12</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.78</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba.datax.tdenginewriter</groupId>
            <artifactId>tdenginewriter</artifactId>
            <version>0.0.1-SNAPSHOT</version>
            <scope>compile</scope>
        </dependency>

        <dependency>
            <groupId>com.taosdata.jdbc</groupId>
            <artifactId>taos-jdbcdriver</artifactId>
            <version>2.0.37</version>
            <exclusions>
                <exclusion>
                    <groupId>com.alibaba</groupId>
                    <artifactId>fastjson</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit-version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba.datax</groupId>
            <artifactId>plugin-rdbms-util</artifactId>
            <version>0.0.1-SNAPSHOT</version>
            <scope>compile</scope>
        </dependency>

        <dependency>
            <groupId>com.alibaba.datax</groupId>
            <artifactId>datax-core</artifactId>
            <version>0.0.1-SNAPSHOT</version>
            <scope>test</scope>
        </dependency>
        <!-- Add the dm8 JDBC jar dependency -->
        <!-- <dependency>-->
        <!-- <groupId>com.dameng</groupId>-->
        <!-- <artifactId>dm-jdbc</artifactId>-->
        <!-- <version>1.8</version>-->
        <!-- <scope>system</scope>-->
        <!-- <systemPath>${project.basedir}/src/test/resources/DmJdbcDriver18.jar</systemPath>-->
        <!-- </dependency>-->
    </dependencies>

    <build>
        <plugins>
            <!-- compiler plugin -->
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>${jdk-version}</source>
                    <target>${jdk-version}</target>
                    <encoding>${project-sourceEncoding}</encoding>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptors>
                        <descriptor>src/main/assembly/package.xml</descriptor>
                    </descriptors>
                    <finalName>datax</finalName>
                </configuration>
                <executions>
                    <execution>
                        <id>dwzip</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.12.4</version>
                <configuration>
                    <!-- Test cases to include -->
                    <includes>
                        <include>**/*Test.java</include>
                    </includes>
                    <!-- Test cases to exclude -->
                    <excludes>
                    </excludes>
                    <testFailureIgnore>true</testFailureIgnore>
                </configuration>
            </plugin>

        </plugins>
    </build>

</project>
34 changes: 34 additions & 0 deletions tdenginereader/src/main/assembly/package.xml
@@ -0,0 +1,34 @@
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
    <id></id>
    <formats>
        <format>dir</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <fileSets>
        <fileSet>
            <directory>src/main/resources</directory>
            <includes>
                <include>plugin.json</include>
                <include>plugin_job_template.json</include>
            </includes>
            <outputDirectory>plugin/reader/tdenginereader</outputDirectory>
        </fileSet>
        <fileSet>
            <directory>target/</directory>
            <includes>
                <include>tdenginereader-0.0.1-SNAPSHOT.jar</include>
            </includes>
            <outputDirectory>plugin/reader/tdenginereader</outputDirectory>
        </fileSet>
    </fileSets>

    <dependencySets>
        <dependencySet>
            <useProjectArtifact>false</useProjectArtifact>
            <outputDirectory>plugin/reader/tdenginereader/libs</outputDirectory>
            <scope>runtime</scope>
        </dependencySet>
    </dependencySets>
</assembly>