Support setting hive_conf_list, hive_var_list and sess_var_list for jdbcURL when connecting to HiveServer2 #33749

Merged 1 commit on Nov 21, 2024
1 change: 1 addition & 0 deletions RELEASE-NOTES.md
@@ -21,6 +21,7 @@
 1. Build: Avoid using `-proc:full` when compiling ShardingSphere with OpenJDK23 - [#33681](https://github.com/apache/shardingsphere/pull/33681)
 1. Doc: Adds documentation for HiveServer2 support - [#33717](https://github.com/apache/shardingsphere/pull/33717)
 1. DistSQL: Check inline expression when create sharding table rule with inline sharding algorithm - [#33735](https://github.com/apache/shardingsphere/pull/33735)
+1. Infra: Support setting `hive_conf_list`, `hive_var_list` and `sess_var_list` for jdbcURL when connecting to HiveServer2 - [#33749](https://github.com/apache/shardingsphere/pull/33749)

 ### Bug Fixes

@@ -88,14 +88,18 @@ ShardingSphere's support for the HiveServer2 JDBC Driver is located in an optional module.

 ```yaml
 services:
-    hive-server2:
-        image: apache/hive:4.0.1
-        environment:
-            SERVICE_NAME: hiveserver2
-        ports:
-            - "10000:10000"
-        expose:
-            - 10002
+  hive-server2:
+    image: apache/hive:4.0.1
+    environment:
+      SERVICE_NAME: hiveserver2
+    ports:
+      - "10000:10000"
+    expose:
+      - 10002
+    volumes:
+      - warehouse:/opt/hive/data/warehouse
+volumes:
+  warehouse:
 ```

 ### Create business tables
@@ -245,14 +249,6 @@ ShardingSphere JDBC DataSource does not yet support executing HiveServer2's `SET` statements

 Users should consider submitting a PR containing unit tests for ShardingSphere.

-### jdbcURL Restrictions
-
-For ShardingSphere configuration files, there are restrictions on HiveServer2's jdbcURL. As background,
-HiveServer2's jdbcURL format is `jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list`.
-ShardingSphere currently only parses the `;hive_conf_list` part, as in `jdbc:hive2://localhost:10000/demo_ds_1;initFile=/tmp/init.sql`.
-
-If users need the `;sess_var_list` or `#hive_var_list` jdbcURL parameters, consider submitting a PR containing unit tests for ShardingSphere.
-
 ### Prerequisites for using DML SQL statements on ShardingSphere data sources

 To use DML SQL statements such as `delete` when connecting to HiveServer2, users should consider using only ACID-capable tables in ShardingSphere JDBC.
@@ -90,14 +90,18 @@ Write a Docker Compose file to start HiveServer2.

 ```yaml
 services:
-    hive-server2:
-        image: apache/hive:4.0.1
-        environment:
-            SERVICE_NAME: hiveserver2
-        ports:
-            - "10000:10000"
-        expose:
-            - 10002
+  hive-server2:
+    image: apache/hive:4.0.1
+    environment:
+      SERVICE_NAME: hiveserver2
+    ports:
+      - "10000:10000"
+    expose:
+      - 10002
+    volumes:
+      - warehouse:/opt/hive/data/warehouse
+volumes:
+  warehouse:
 ```

 ### Create business tables
@@ -250,16 +254,6 @@ ShardingSphere JDBC DataSource does not yet support executing HiveServer2's `SET

 Users should consider submitting a PR containing unit tests for ShardingSphere.

-### jdbcURL Restrictions
-
-For ShardingSphere configuration files, there are restrictions on HiveServer2's jdbcURL. Introduction premise,
-HiveServer2's jdbcURL format is `jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list`.
-
-ShardingSphere currently only supports the `;hive_conf_list` part represented by `jdbc:hive2://localhost:10000/demo_ds_1;initFile=/tmp/init.sql`.
-
-If users need to use the jdbcURL parameters of `;sess_var_list` or `#hive_var_list`,
-consider submitting a PR containing unit tests for ShardingSphere.
-
 ### Prerequisites for using DML SQL statements on ShardingSphere data sources

 In order to be able to use DML SQL statements such as `delete`,
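
For readers unfamiliar with the URL layout quoted in the jdbcURL Restrictions section, the split into `sess_var_list`, `hive_conf_list` and `hive_var_list` can be sketched as follows. This is only an illustration that assumes the `;sess_var_list?hive_conf_list#hive_var_list` layout from that format string. It is not how this PR parses URLs, since the PR delegates to `org.apache.hive.jdbc.Utils.parseURL`. The class `HiveJdbcUrlSketch`, its methods, and the keys `foo.conf` and `bar_var` are hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ShardingSphere or the Hive JDBC driver.
final class HiveJdbcUrlSketch {

    // Splits "k1=v1;k2=v2" into an ordered map; entries without '=' are skipped.
    static Map<String, String> toMap(final String list) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String each : list.split(";")) {
            int index = each.indexOf('=');
            if (index > 0) {
                result.put(each.substring(0, index), each.substring(index + 1));
            }
        }
        return result;
    }

    // Returns the three parameter lists, keyed by the names used in the URL format.
    static Map<String, Map<String, String>> split(final String url) {
        String rest = url;
        String hiveVars = "";
        String hiveConfs = "";
        String sessVars = "";
        // Everything after '#' is assumed to be hive_var_list.
        int hash = rest.indexOf('#');
        if (hash >= 0) {
            hiveVars = rest.substring(hash + 1);
            rest = rest.substring(0, hash);
        }
        // Everything between '?' and '#' is assumed to be hive_conf_list.
        int question = rest.indexOf('?');
        if (question >= 0) {
            hiveConfs = rest.substring(question + 1);
            rest = rest.substring(0, question);
        }
        // Everything between the first ';' and '?' is assumed to be sess_var_list.
        int semicolon = rest.indexOf(';');
        if (semicolon >= 0) {
            sessVars = rest.substring(semicolon + 1);
        }
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        result.put("sess_var_list", toMap(sessVars));
        result.put("hive_conf_list", toMap(hiveConfs));
        result.put("hive_var_list", toMap(hiveVars));
        return result;
    }

    public static void main(final String[] args) {
        // foo.conf and bar_var are placeholder keys, not real Hive settings.
        String url = "jdbc:hive2://localhost:10000/demo_ds_1;initFile=/tmp/init.sql?foo.conf=a#bar_var=b";
        split(url).forEach((name, values) -> System.out.println(name + " -> " + values));
    }
}
```

The sketch ignores multi-host URLs, ZooKeeper discovery and validation, all of which the Hive driver's own parser handles.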
@@ -17,29 +17,42 @@

 package org.apache.shardingsphere.infra.database.hive.connector;

+import lombok.SneakyThrows;
+import org.apache.hive.jdbc.JdbcUriParseException;
+import org.apache.hive.jdbc.Utils;
+import org.apache.hive.jdbc.Utils.JdbcConnectionParams;
+import org.apache.hive.jdbc.ZooKeeperHiveClientException;
 import org.apache.shardingsphere.infra.database.core.connector.ConnectionProperties;
 import org.apache.shardingsphere.infra.database.core.connector.ConnectionPropertiesParser;
 import org.apache.shardingsphere.infra.database.core.connector.StandardConnectionProperties;
-import org.apache.shardingsphere.infra.database.core.connector.url.JdbcUrl;
-import org.apache.shardingsphere.infra.database.core.connector.url.StandardJdbcUrlParser;

+import java.sql.SQLException;
 import java.util.Properties;

 /**
  * Connection properties parser of Hive.
  */
 public final class HiveConnectionPropertiesParser implements ConnectionPropertiesParser {

-    private static final int DEFAULT_PORT = 10000;
-
-    private static final String DEFAULT_HOSTNAME = "localhost";
-
+    @SneakyThrows({ZooKeeperHiveClientException.class, JdbcUriParseException.class, SQLException.class})
     @Override
     public ConnectionProperties parse(final String url, final String username, final String catalog) {
-        JdbcUrl jdbcUrl = new StandardJdbcUrlParser().parse(url);
-        return jdbcUrl.getHostname().isEmpty()
-                ? new StandardConnectionProperties(DEFAULT_HOSTNAME, jdbcUrl.getPort(DEFAULT_PORT), jdbcUrl.getDatabase(), null, jdbcUrl.getQueryProperties(), new Properties())
-                : new StandardConnectionProperties(jdbcUrl.getHostname(), jdbcUrl.getPort(DEFAULT_PORT), jdbcUrl.getDatabase(), null, jdbcUrl.getQueryProperties(), new Properties());
+        JdbcConnectionParams params = Utils.parseURL(url, new Properties());
+        if (null == params.getHost() && 0 == params.getPort()) {
+            throw new RuntimeException("HiveServer2 in embedded mode has been deprecated by Apache Hive, "
+                    + "See https://issues.apache.org/jira/browse/HIVE-28418 . "
+                    + "Users should start local HiveServer2 through Docker Image https://hub.docker.com/r/apache/hive .");
+        }
+        Properties queryProperties = new Properties();
+        queryProperties.putAll(params.getSessionVars());
+        queryProperties.putAll(params.getHiveConfs());
+        queryProperties.putAll(params.getHiveVars());
+        return new StandardConnectionProperties(params.getHost(),
+                params.getPort(),
+                params.getDbName(),
+                null,
+                queryProperties,
+                new Properties());
     }

     @Override
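
The three `putAll` calls in the new `parse` method flatten the session variables, Hive confs and Hive vars into one `Properties` object. The sketch below reproduces only that merge step with plain maps, assuming each list is available as a `Map<String, String>`. The class `QueryPropertiesMergeSketch` and its `merge` method are hypothetical and exist only for this illustration. One consequence of the call order is that when the same key appears in more than one list, the value from the later `putAll` replaces the earlier one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration of the merge order; not ShardingSphere code.
final class QueryPropertiesMergeSketch {

    // Same putAll order as the diff: session vars, then Hive confs, then Hive vars.
    static Properties merge(final Map<String, String> sessionVars,
                            final Map<String, String> hiveConfs,
                            final Map<String, String> hiveVars) {
        Properties result = new Properties();
        result.putAll(sessionVars);
        result.putAll(hiveConfs);
        result.putAll(hiveVars);
        return result;
    }

    public static void main(final String[] args) {
        // Values taken from the "complex" test URL in this PR.
        Map<String, String> sessionVars = new LinkedHashMap<>();
        sessionVars.put("user", "foo");
        sessionVars.put("password", "bar");
        Map<String, String> hiveConfs = new LinkedHashMap<>();
        hiveConfs.put("transportMode", "http");
        hiveConfs.put("httpPath", "hs2");
        Properties merged = merge(sessionVars, hiveConfs, new LinkedHashMap<>());
        System.out.println(merged.getProperty("user"));
        System.out.println(merged.getProperty("httpPath"));
    }
}
```

Because the merged object no longer records which list a key came from, callers that need that distinction would have to keep the three maps separately.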
@@ -20,6 +20,7 @@
 import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
+import org.apache.hadoop.hive.metastore.api.GetTableRequest;
 import org.apache.hadoop.hive.metastore.api.Table;
 import org.apache.shardingsphere.infra.database.core.metadata.data.loader.DialectMetaDataLoader;
 import org.apache.shardingsphere.infra.database.core.metadata.data.loader.MetaDataLoaderMaterial;
@@ -85,7 +86,8 @@ private Collection<TableMetaData> getTableMetaData(final Collection<String> tabl
         Map<String, Integer> dataTypes = getDataType(material.getDataSource());
         Collection<TableMetaData> result = new LinkedList<>();
         for (String each : tables) {
-            result.add(new TableMetaData(each, getColumnMetaData(storeClient.getTable(material.getDefaultSchemaName(), each), dataTypes), Collections.emptyList(), Collections.emptyList()));
+            GetTableRequest req = new GetTableRequest(material.getDefaultSchemaName(), each);
+            result.add(new TableMetaData(each, getColumnMetaData(storeClient.getTable(req), dataTypes), Collections.emptyList(), Collections.emptyList()));
         }
         return result;
     }
@@ -17,13 +17,14 @@

 package org.apache.shardingsphere.infra.database.hive.connector;

+import org.apache.hive.jdbc.JdbcUriParseException;
 import org.apache.shardingsphere.infra.database.core.connector.ConnectionProperties;
 import org.apache.shardingsphere.infra.database.core.connector.ConnectionPropertiesParser;
-import org.apache.shardingsphere.infra.database.core.exception.UnrecognizedDatabaseURLException;
 import org.apache.shardingsphere.infra.database.core.spi.DatabaseTypedSPILoader;
 import org.apache.shardingsphere.infra.database.core.type.DatabaseType;
 import org.apache.shardingsphere.infra.spi.type.typed.TypedSPILoader;
 import org.apache.shardingsphere.test.util.PropertiesBuilder;
+import org.apache.shardingsphere.test.util.PropertiesBuilder.Property;
 import org.junit.jupiter.api.Test;
 import org.junit.jupiter.api.extension.ExtensionContext;
 import org.junit.jupiter.params.ParameterizedTest;
@@ -55,17 +56,26 @@ void assertNewConstructor(final String name, final String url, final String host

     @Test
     void assertNewConstructorFailure() {
-        assertThrows(UnrecognizedDatabaseURLException.class, () -> parser.parse("jdbc:hive2:xxxxxxxx", null, null));
+        assertThrows(JdbcUriParseException.class, () -> parser.parse("jdbc:hive2://localhost:10000;principal=test", null, null));
+        assertThrows(JdbcUriParseException.class, () -> parser.parse("jdbc:hive2://localhost:10000;principal=hive/HiveServer2Host@YOUR-REALM.COM", null, null));
+        assertThrows(JdbcUriParseException.class, () -> parser.parse("jdbc:hive2://localhost:10000test", null, null));
+        assertThrows(RuntimeException.class, () -> parser.parse("jdbc:hive2://", null, null));
     }

     private static class NewConstructorTestCaseArgumentsProvider implements ArgumentsProvider {

         @Override
         public Stream<? extends Arguments> provideArguments(final ExtensionContext extensionContext) {
             return Stream.of(
-                    Arguments.of("simple", "jdbc:hive2:///foo_ds", "localhost", 10000, "foo_ds", null, new Properties()),
-                    Arguments.of("complex", "jdbc:hive2://127.0.0.1:9999/foo_ds?transportMode=http", "127.0.0.1", 9999, "foo_ds", null,
-                            PropertiesBuilder.build(new PropertiesBuilder.Property("transportMode", "http"))));
+                    Arguments.of("simple_first", "jdbc:hive2://localhost:10001/default", "localhost", 10001, "default", null, new Properties()),
+                    Arguments.of("simple_second", "jdbc:hive2://localhost/notdefault", "localhost", 10000, "notdefault", null, new Properties()),
+                    Arguments.of("simple_third", "jdbc:hive2://foo:1243", "foo", 1243, "default", null, new Properties()),
+                    Arguments.of("complex", "jdbc:hive2://server:10002/db;user=foo;password=bar?transportMode=http;httpPath=hs2",
+                            "server", 10002, "db", null, PropertiesBuilder.build(
+                                    new Property("user", "foo"),
+                                    new Property("password", "bar"),
+                                    new Property("transportMode", "http"),
+                                    new Property("httpPath", "hs2"))));
         }
     }
 }