Commit 13c5185

Doc: switch to use iceberg-aws-bundle jar (#1609)
1 parent 8339b67 commit 13c5185

9 files changed: +16 -18 lines changed

getting-started/eclipselink/docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ services:
  retries: 15
  command: [
  /opt/spark/bin/spark-sql,
- --packages, "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,software.amazon.awssdk:bundle:2.28.17,software.amazon.awssdk:url-connection-client:2.28.17,org.apache.iceberg:iceberg-gcp-bundle:1.9.0,org.apache.iceberg:iceberg-azure-bundle:1.9.0",
+ --packages, "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0,org.apache.iceberg:iceberg-gcp-bundle:1.9.0,org.apache.iceberg:iceberg-azure-bundle:1.9.0",
  --conf, "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
  --conf, "spark.sql.catalog.quickstart_catalog=org.apache.iceberg.spark.SparkCatalog",
  --conf, "spark.sql.catalog.quickstart_catalog.type=rest",
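
Review note: after this change every coordinate in the `--packages` value belongs to `org.apache.iceberg` and carries the same version, so the whole value could be derived from one version string. A minimal Python sketch of that idea, assuming a hypothetical helper `iceberg_packages` (it is not part of Polaris or Iceberg) and the artifact names shown in the hunk above:

```python
# Hypothetical helper (not part of Polaris or Iceberg): derive the
# comma-separated value passed to spark-sql `--packages` from one version.
ICEBERG_GROUP = "org.apache.iceberg"


def iceberg_packages(version, artifacts):
    """Return Maven coordinates (group:artifact:version) joined by commas."""
    return ",".join(f"{ICEBERG_GROUP}:{name}:{version}" for name in artifacts)


# Artifact names copied from the updated docker-compose command.
ARTIFACTS = [
    "iceberg-spark-runtime-3.5_2.12",
    "iceberg-aws-bundle",
    "iceberg-gcp-bundle",
    "iceberg-azure-bundle",
]

packages = iceberg_packages("1.9.0", ARTIFACTS)
print(packages)
```

With a single version argument, a future Iceberg upgrade would only need to change one value rather than several independently pinned coordinates.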

getting-started/jdbc/docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ services:
  retries: 15
  command: [
  /opt/spark/bin/spark-sql,
- --packages, "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,software.amazon.awssdk:bundle:2.28.17,software.amazon.awssdk:url-connection-client:2.28.17,org.apache.iceberg:iceberg-gcp-bundle:1.9.0,org.apache.iceberg:iceberg-azure-bundle:1.9.0",
+ --packages, "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0,org.apache.iceberg:iceberg-gcp-bundle:1.9.0,org.apache.iceberg:iceberg-azure-bundle:1.9.0",
  --conf, "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
  --conf, "spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog",
  --conf, "spark.sql.catalog.polaris.type=rest",
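
Review note: one way to see the effect of this hunk is to group the coordinates by Maven group and list the versions each group pins. The sketch below is illustrative only; `versions_by_group` is a hypothetical helper, and the two strings are copied from the removed and added lines above.

```python
from collections import defaultdict


def versions_by_group(packages):
    """Map each Maven group to the set of versions used in a packages string."""
    versions = defaultdict(set)
    for coordinate in packages.split(","):
        # Each coordinate has the form group:artifact:version.
        group, _artifact, version = coordinate.strip().split(":")
        versions[group].add(version)
    return dict(versions)


# Value on the removed line of the hunk.
OLD = (
    "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,"
    "software.amazon.awssdk:bundle:2.28.17,"
    "software.amazon.awssdk:url-connection-client:2.28.17,"
    "org.apache.iceberg:iceberg-gcp-bundle:1.9.0,"
    "org.apache.iceberg:iceberg-azure-bundle:1.9.0"
)

# Value on the added line of the hunk.
NEW = (
    "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,"
    "org.apache.iceberg:iceberg-aws-bundle:1.9.0,"
    "org.apache.iceberg:iceberg-gcp-bundle:1.9.0,"
    "org.apache.iceberg:iceberg-azure-bundle:1.9.0"
)

old_versions = versions_by_group(OLD)
new_versions = versions_by_group(NEW)
print(sorted(old_versions))
print(sorted(new_versions))
```

The old value pins two groups (`org.apache.iceberg` and `software.amazon.awssdk`) at separate versions, while the new value pins only `org.apache.iceberg`.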

getting-started/spark/notebooks/SparkPolaris.ipynb

Lines changed: 1 addition & 1 deletion
@@ -256,7 +256,7 @@
  "\n",
  "spark = (SparkSession.builder\n",
  " .config(\"spark.sql.catalog.spark_catalog\", \"org.apache.iceberg.spark.SparkSessionCatalog\")\n",
- " .config(\"spark.jars.packages\", \"org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.hadoop:hadoop-aws:3.4.0,software.amazon.awssdk:bundle:2.23.19,software.amazon.awssdk:url-connection-client:2.23.19\")\n",
+ " .config(\"spark.jars.packages\", \"org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0\")\n",
  " .config('spark.sql.iceberg.vectorization.enabled', 'false')\n",
  " \n",
  " # Configure the 'polaris' catalog as an Iceberg rest catalog\n",

plugins/spark/README.md

Lines changed: 5 additions & 5 deletions
@@ -31,7 +31,7 @@ and depends on iceberg-spark-runtime 1.9.0.
  # Build Plugin Jar
  A task createPolarisSparkJar is added to build a jar for the Polaris Spark plugin, the jar is named as:
  `polaris-iceberg-<icebergVersion>-spark-runtime-<sparkVersion>_<scalaVersion>-<polarisVersion>.jar`. For example:
- `polaris-iceberg-1.8.1-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar`.
+ `polaris-iceberg-1.9.0-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar`.

  - `./gradlew :polaris-spark-3.5_2.12:createPolarisSparkJar` -- build jar for Spark 3.5 with Scala version 2.12.
  - `./gradlew :polaris-spark-3.5_2.13:createPolarisSparkJar` -- build jar for Spark 3.5 with Scala version 2.13.
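
Review note: the jar naming pattern quoted in this hunk can be checked against the updated example with a small Python sketch. The function name `polaris_spark_jar_name` is hypothetical and only mirrors the pattern from the README; it is not how the Gradle task computes the name.

```python
def polaris_spark_jar_name(iceberg_version, spark_version, scala_version, polaris_version):
    """Fill in the README pattern:
    polaris-iceberg-<icebergVersion>-spark-runtime-<sparkVersion>_<scalaVersion>-<polarisVersion>.jar
    """
    return (
        f"polaris-iceberg-{iceberg_version}-spark-runtime-"
        f"{spark_version}_{scala_version}-{polaris_version}.jar"
    )


# Values taken from the example on the added line of the hunk.
jar_name = polaris_spark_jar_name(
    iceberg_version="1.9.0",
    spark_version="3.5",
    scala_version="2.12",
    polaris_version="0.10.0-beta-incubating-SNAPSHOT",
)
print(jar_name)
```

Only the `<icebergVersion>` segment differs between the removed and added example names (1.8.1 versus 1.9.0).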
@@ -53,7 +53,7 @@ jar, and to use the local Polaris server as a Catalog.
  ```shell
  bin/spark-shell \
  --jars <path-to-spark-client-jar> \
- --packages org.apache.hadoop:hadoop-aws:3.4.0,io.delta:delta-spark_2.12:3.3.1 \
+ --packages org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
  --conf spark.sql.catalog.<catalog-name>.warehouse=<catalog-name> \
@@ -67,13 +67,13 @@ bin/spark-shell \
  ```

  Assume the path to the built Spark client jar is
- `/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-iceberg-1.8.1-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar`
+ `/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-iceberg-1.9.0-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar`
  and the name of the catalog is `polaris`. The cli command will look like following:

  ```shell
  bin/spark-shell \
- --jars /polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-iceberg-1.8.1-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar \
- --packages org.apache.hadoop:hadoop-aws:3.4.0,io.delta:delta-spark_2.12:3.3.1 \
+ --jars /polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-iceberg-1.9.0-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar \
+ --packages org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
  --conf spark.sql.catalog.polaris.warehouse=<catalog-name> \

plugins/spark/v3.5/getting-started/notebooks/SparkPolaris.ipynb

Lines changed: 2 additions & 2 deletions
@@ -266,8 +266,8 @@
  "from pyspark.sql import SparkSession\n",
  "\n",
  "spark = (SparkSession.builder\n",
- " .config(\"spark.jars\", \"../polaris_libs/polaris-iceberg-1.8.1-spark-runtime-3.5_2.12-0.11.0-beta-incubating-SNAPSHOT.jar\")\n",
- " .config(\"spark.jars.packages\", \"org.apache.hadoop:hadoop-aws:3.3.4,io.delta:delta-spark_2.12:3.2.1\")\n",
+ " .config(\"spark.jars\", \"../polaris_libs/polaris-iceberg-1.9.0-spark-runtime-3.5_2.12-0.11.0-beta-incubating-SNAPSHOT.jar\")\n",
+ " .config(\"spark.jars.packages\", \"org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.2.1\")\n",
  " .config(\"spark.sql.catalog.spark_catalog\", \"org.apache.spark.sql.delta.catalog.DeltaCatalog\")\n",
  " .config('spark.sql.iceberg.vectorization.enabled', 'false')\n",
  "\n",

regtests/setup.sh

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ else
  cat << EOF >> ${SPARK_CONF}

  # POLARIS_TESTCONF_V5
- spark.jars.packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:${ICEBERG_VERSION},org.apache.hadoop:hadoop-aws:3.4.0,software.amazon.awssdk:bundle:2.23.19,software.amazon.awssdk:url-connection-client:2.23.19
+ spark.jars.packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:${ICEBERG_VERSION},org.apache.iceberg:iceberg-aws-bundle:${ICEBERG_VERSION}
  spark.hadoop.fs.s3.impl org.apache.hadoop.fs.s3a.S3AFileSystem
  spark.hadoop.fs.AbstractFileSystem.s3.impl org.apache.hadoop.fs.s3a.S3A
  spark.sql.variable.substitute true

regtests/t_pyspark/src/iceberg_spark.py

Lines changed: 1 addition & 3 deletions
@@ -75,9 +75,7 @@ def __enter__(self):
  """
  packages = [
  "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0",
- "org.apache.hadoop:hadoop-aws:3.4.0",
- "software.amazon.awssdk:bundle:2.23.19",
- "software.amazon.awssdk:url-connection-client:2.23.19",
+ "org.apache.iceberg:iceberg-aws-bundle:1.9.0",
  ]
  excludes = ["org.checkerframework:checker-qual", "com.google.errorprone:error_prone_annotations"]

site/content/in-dev/unreleased/getting-started/using-polaris.md

Lines changed: 2 additions & 2 deletions
@@ -154,7 +154,7 @@ _Note: the credentials provided here are those for our principal, not the root c

  ```shell
  bin/spark-sql \
- --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.hadoop:hadoop-aws:3.4.0 \
+ --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.quickstart_catalog.warehouse=quickstart_catalog \
  --conf spark.sql.catalog.quickstart_catalog.header.X-Iceberg-Access-Delegation=vended-credentials \
@@ -170,7 +170,7 @@ bin/spark-sql \

  Similar to the CLI commands above, this configures Spark to use the Polaris running at `localhost:8181`. If your Polaris server is running elsewhere, but sure to update the configuration appropriately.

- Finally, note that we include the `hadoop-aws` package here. If your table is using a different filesystem, be sure to include the appropriate dependency.
+ Finally, note that we include the `iceberg-aws-bundle` package here. If your table is using a different filesystem, be sure to include the appropriate dependency.

  #### Using Spark SQL from a Docker container

site/content/in-dev/unreleased/polaris-spark-client.md

Lines changed: 2 additions & 2 deletions
@@ -60,7 +60,7 @@ a released Polaris Spark client.

  ```shell
  bin/spark-shell \
- --packages <polaris-spark-client-package>,org.apache.hadoop:hadoop-aws:3.4.0,io.delta:delta-spark_2.12:3.3.1 \
+ --packages <polaris-spark-client-package>,org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
  --conf spark.sql.catalog.<spark-catalog-name>.warehouse=<polaris-catalog-name> \
@@ -88,7 +88,7 @@ You can also start the connection by programmatically initialize a SparkSession,
  from pyspark.sql import SparkSession

  spark = SparkSession.builder
- .config("spark.jars.packages", "<polaris-spark-client-package>,org.apache.hadoop:hadoop-aws:3.3.4,io.delta:delta-spark_2.12:3.3.1")
+ .config("spark.jars.packages", "<polaris-spark-client-package>,org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.<spark-catalog-name>", "org.apache.polaris.spark.SparkCatalog")
