
Commit 021da73

fixed readme
1 parent bd1e47a commit 021da73

3 files changed: +58 -33 lines changed


metastore/src/java/org/apache/hadoop/hive/metastore/HiveProtoEventsCleanerTask.java

Lines changed: 0 additions & 1 deletion
@@ -33,7 +33,6 @@
 import java.security.PrivilegedExceptionAction;
 import java.time.Instant;
 import java.time.LocalDate;
-import java.time.LocalDateTime;
 import java.time.ZoneOffset;
 import java.time.format.DateTimeFormatter;
 import java.util.ArrayList;

packaging/src/docker/README.md

Lines changed: 24 additions & 0 deletions
@@ -306,3 +306,27 @@ docker compose exec hiveserver2-standalone /bin/bash
 /opt/hive/bin/schematool -initSchema -dbType hive -metaDbType postgres -url jdbc:hive2://localhost:10000/default
 exit
 ```
+
+#### Hive with S3-backed warehouse storage
+
+1. Download the AWS SDK bundle and place it under jars/ directory.
+
+**Disclaimer:**
+Hadoop **3.4.1** requires **AWS SDK v2**.
+```shell
+wget https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/2.26.19/bundle-2.26.19.jar -P jars/
+```
+
+2. Set the following environment variables:
+- AWS_ACCESS_KEY_ID
+- AWS_SECRET_ACCESS_KEY
+- DEFAULT_FS
+- HIVE_WAREHOUSE_PATH
+- S3_ENDPOINT_URL
+
+```shell
+DEFAULT_FS="s3a://dw-team-bucket" \
+HIVE_WAREHOUSE_PATH="/data/warehouse/tablespace/managed/hive" \
+S3_ENDPOINT_URL="s3.us-west-2.amazonaws.com" \
+docker-compose up
+```
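
The compose command added above sets only three of the five variables the new section lists; the two AWS credentials are evidently supplied separately. A minimal sketch of a full invocation, assuming the compose file forwards exported variables to the containers; the bucket, path, and endpoint are copied from the example in the diff, while the key values are placeholders, not part of this commit:

```shell
# Placeholders only: substitute real credentials. Assumes the compose file
# passes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY through to the Hive services.
export AWS_ACCESS_KEY_ID="<access-key-id>"
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"

# Remaining values copied from the README example above.
DEFAULT_FS="s3a://dw-team-bucket" \
HIVE_WAREHOUSE_PATH="/data/warehouse/tablespace/managed/hive" \
S3_ENDPOINT_URL="s3.us-west-2.amazonaws.com" \
docker-compose up
```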

standalone-metastore/packaging/src/docker/README.md

Lines changed: 34 additions & 32 deletions
@@ -84,30 +84,30 @@ or assuming that you're relying on current `project.version` from pom.xml,
 ```shell
 export HIVE_VERSION=$(mvn -f pom.xml -q help:evaluate -Dexpression=project.version -DforceStdout)
 ```
-- Metastore
+#### Metastore

 For a quick start, launch the Metastore with Derby,
-```shell
-docker run -d -p 9083:9083 --name metastore-standalone apache/hive:standalone-metastore-${HIVE_VERSION}
-```
-Everything would be lost when the service is down. In order to save the Hive table's schema and data, start the container with an external Postgres and Volume to keep them,
-
-```shell
-docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
-     --env SERVICE_OPTS="-Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://postgres:5432/metastore_db -Djavax.jdo.option.ConnectionUserName=hive -Djavax.jdo.option.ConnectionPassword=password" \
-     --mount source=warehouse,target=/opt/hive/data/warehouse \
-     --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
-     --name metastore-standalone apache/hive:standalone-metastore-${HIVE_VERSION}
-```
-
-If you want to use your own `hdfs-site.xml` for the service, you can provide the environment variable `HIVE_CUSTOM_CONF_DIR` for the command. For instance, put the custom configuration file under the directory `/opt/hive/conf`, then run,
-
-```shell
-docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
-     -v /opt/hive/conf:/hive_custom_conf --env HIVE_CUSTOM_CONF_DIR=/hive_custom_conf \
-     --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
-     --name metastore apache/hive:standalone-metastore-${HIVE_VERSION}
-```
+```shell
+docker run -d -p 9083:9083 --name metastore-standalone apache/hive:standalone-metastore-${HIVE_VERSION}
+```
+Everything would be lost when the service is down. In order to save the Hive table's schema and data, start the container with an external Postgres and Volume to keep them,
+
+```shell
+docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
+     --env SERVICE_OPTS="-Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://postgres:5432/metastore_db -Djavax.jdo.option.ConnectionUserName=hive -Djavax.jdo.option.ConnectionPassword=password" \
+     --mount source=warehouse,target=/opt/hive/data/warehouse \
+     --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
+     --name metastore-standalone apache/hive:standalone-metastore-${HIVE_VERSION}
+```
+
+If you want to use your own `hdfs-site.xml` for the service, you can provide the environment variable `HIVE_CUSTOM_CONF_DIR` for the command. For instance, put the custom configuration file under the directory `/opt/hive/conf`, then run,
+
+```shell
+docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
+     -v /opt/hive/conf:/hive_custom_conf --env HIVE_CUSTOM_CONF_DIR=/hive_custom_conf \
+     --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
+     --name metastore apache/hive:standalone-metastore-${HIVE_VERSION}
+```

 NOTE:

@@ -116,7 +116,7 @@ then add "--env SCHEMA_COMMAND=upgradeSchema" to the command.

 2) If the full Acid support (Compaction) is needed, use the Hive docker image to bring up the container.

-- Metastore with Postgres
+#### Metastore with Postgres

 To spin up Metastore with a remote DB, there is a `docker-compose.yml` placed under `packaging/src/docker` for this purpose,
 specify the `POSTGRES_LOCAL_PATH` first:
@@ -131,19 +131,21 @@ export POSTGRES_LOCAL_PATH=`mvn help:evaluate -Dexpression=settings.localReposit
 If you don't install maven or have problem in resolving the postgres driver, you can always download this jar yourself,
 change the `POSTGRES_LOCAL_PATH` to the path of the downloaded jar.

-- Metastore with S3 support
+#### Metastore with S3-backed warehouse storage

-Download AWS SDK bundle and place it under the jars directory:
+1. Download the AWS SDK bundle and place it under jars/ directory.

-Note: Hadoop 3.4.1 requires both AWS SDK V1 and V2 for its S3A connector:
-wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.12.770/aws-java-sdk-bundle-1.12.770.jar -P jars/
+**Disclaimer:**
+Hadoop **3.4.1** requires **AWS SDK v2**.
+```shell
 wget https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/2.26.19/bundle-2.26.19.jar -P jars/
+```

-Add the following ENV variables:
-- AWS_ACCESS_KEY_ID
-- AWS_SECRET_ACCESS_KEY
-- DEFAULT_FS
-- HIVE_WAREHOUSE_PATH
+2. Set the following environment variables:
+- AWS_ACCESS_KEY_ID
+- AWS_SECRET_ACCESS_KEY
+- DEFAULT_FS
+- HIVE_WAREHOUSE_PATH
 - S3_ENDPOINT_URL

 Then,
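
The NOTE referenced in the second hunk says to add `--env SCHEMA_COMMAND=upgradeSchema` to the command when an existing metastore schema should be upgraded rather than initialized. As a minimal sketch, assuming that flag composes with the Postgres `docker run` example from the first hunk exactly as written (this combination is not shown in the commit):

```shell
# Sketch: upgrade an existing metastore schema instead of initializing one.
# Every flag except SCHEMA_COMMAND is copied from the Postgres example above;
# combining them this way is an assumption, not something this commit shows.
docker run -d -p 9083:9083 --env DB_DRIVER=postgres \
     --env SCHEMA_COMMAND=upgradeSchema \
     --env SERVICE_OPTS="-Djavax.jdo.option.ConnectionDriverName=org.postgresql.Driver -Djavax.jdo.option.ConnectionURL=jdbc:postgresql://postgres:5432/metastore_db -Djavax.jdo.option.ConnectionUserName=hive -Djavax.jdo.option.ConnectionPassword=password" \
     --mount source=warehouse,target=/opt/hive/data/warehouse \
     --mount type=bind,source=`mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout`/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar,target=/opt/hive/lib/postgres.jar \
     --name metastore-standalone apache/hive:standalone-metastore-${HIVE_VERSION}
```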
