Add Docker-based Ceph + Polaris cluster setup #3022
Conversation
Just out of curiosity, @sharas2050 do you know how much (or better: how many containers) would be minimally needed to run Ceph with STS/IAM?
It is the same number as when using it without STS. It is a matter of Ceph configuration and bucket management. I wrote a small article earlier this year about that. You can read it here. Even in this example you can combine MON+MGR in a single container.
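In short, the RGW side only needs a couple of extra options. A minimal sketch, assuming the gateway daemon from this setup is called rgw1 (the 16-character STS key below is just a placeholder, not taken from this PR):

# Sketch: enable STS on the existing RGW daemon, then restart it
cat >> /etc/ceph/ceph.conf <<'EOF'
[client.rgw.rgw1]
rgw_s3_auth_use_sts = true
rgw_sts_key = 0123456789abcdef
EOF
# restart the gateway so it picks up the new options (service name is an assumption)
docker compose restart rgw1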
dimas-b left a comment
Thanks for your contribution, @sharas2050! LGTM overall, just a minor comment about .env.
This guide describes how to spin up a **single-node Ceph cluster** with **RADOS Gateway (RGW)** for S3-compatible storage and configure it for use by **Polaris**.
This example cluster is configured for basic access key authentication only.
It does not include STS (Security Token Service) or temporary credentials.
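For context, "basic access key authentication" here means a plain RGW user. A sketch of how such a user is typically created with radosgw-admin; the uid and display name are assumptions, the keys are the ones used later in this thread:

radosgw-admin user create --uid=polaris --display-name="Polaris" \
  --access-key=POLARIS123ACCESS --secret-key=POLARIS456SECRET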
Would you mind adding a getting-started guide with IAM/STS as a follow-up to this PR?
@snazy I just updated the PR with your suggested changes.
Let me try it locally.
dimas-b left a comment
Sadly, mon1 fails in my env. (Linux + Podman)... logs:
+ sudo -u ceph ceph-mon --mkfs -i mon1 --monmap /var/lib/ceph/tmp/monmap --keyring /var/lib/ceph/tmp/ceph.mon.keyring
sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required
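A possible workaround (an assumption on my side, untested in this thread) would be to let ceph-mon drop privileges itself instead of going through sudo:

# sudo-free alternative for the mon bootstrap step; ceph-mon switches to the ceph user itself
ceph-mon --mkfs -i mon1 \
  --monmap /var/lib/ceph/tmp/monmap \
  --keyring /var/lib/ceph/tmp/ceph.mon.keyring \
  --setuser ceph --setgroup ceph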
spark-sql ()> create namespace ns;
Time taken: 0.374 seconds
spark-sql ()> create table ns.t1 as select 'abc';
This one fails with
Credential vending was requested for table ns.t1, but no credentials are available
java.lang.IllegalArgumentException: Credential vending was requested for table ns.t1, but no credentials are available
Do you have a stack trace, @snazy ?
nvm, I'll run locally
Duh - I copied it, but never pasted it 🤦
But it was about credential vending. So likely an Iceberg config.
Yeah, but in this case STS is not available, so credential vending should not be involved 🤔
According to the logs, you are working on an already existing namespace, which indicates an already existing Polaris instance/catalog:
spark-sql ()> create namespace ns;
[SCHEMA_ALREADY_EXISTS] Cannot create schema `ns` because it already exists.
Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema.
Yes, but that's got nothing to do with the credential-vending error.
It seems the setup isn’t working on your end, while it’s running fine for me with different Spark versions.
Could you please share the log files from the polaris-setup and polaris containers so I can investigate the issue more accurately?
That would really help me understand what’s going wrong in your environment.
$ docker container logs ceph-polaris-setup-1
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/community/x86_64/APKINDEX.tar.gz
(1/2) Installing oniguruma (6.9.10-r0)
(2/2) Installing jq (1.8.0-r0)
Executing busybox-1.37.0-r19.trigger
OK: 13 MiB in 27 packages
Creating catalog...
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/community/x86_64/APKINDEX.tar.gz
OK: 13 MiB in 27 packages
Obtained access token: eyJhbG...
STORAGE_LOCATION is set to 's3://polaris-storage'
Using StorageType: S3
Creating a catalog named quickstart_catalog in realm POLARIS...
{ "catalog": { "name": "quickstart_catalog", "type": "INTERNAL", "readOnly": false, "properties": { "default-base-location": "s3://polaris-storage" }, "storageConfigInfo": {"storageType":"S3", "endpoint":"http://rgw1:7480", "stsUnavailable":"true", "pathStyleAccess":true} } }
* Host polaris:8181 was resolved.
* IPv6: (none)
* IPv4: 10.89.0.7
* Trying 10.89.0.7:8181...
* Connected to polaris (10.89.0.7) port 8181
* using HTTP/1.x
> POST /api/management/v1/catalogs HTTP/1.1
> Host: polaris:8181
> User-Agent: curl/8.14.1
> Authorization: Bearer eyJhbGciO...
> Accept: application/json
> Content-Type: application/json
> Polaris-Realm: POLARIS
> Content-Length: 326
>
} [326 bytes data]
* upload completely sent off: 326 bytes
< HTTP/1.1 201 Created
< Content-Type: application/json;charset=UTF-8
< content-length: 343
< Polaris-Request-Id: b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000002
<
{ [343 bytes data]
* Connection #0 to host polaris left intact
{"type":"INTERNAL","name":"quickstart_catalog","properties":{"default-base-location":"s3://polaris-storage"},"createTimestamp":1763031212314,"lastUpdateTimestamp":0,"entityVersion":1,"storageConfigInfo":{"endpoint":"http://rgw1:7480","stsUnavailable":true,"pathStyleAccess":true,"storageType":"S3","allowedLocations":["s3://polaris-storage"]}}
Done.
Extra grants...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 56 0 0 100 56 0 2524 --:--:-- --:--:-- --:--:-- 2545
Done.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 370 100 370 0 0 46464 0 --:--:-- --:--:-- --:--:-- 52857
{"catalogs":[{"type":"INTERNAL","name":"quickstart_catalog","properties":{"default-base-location":"s3://polaris-storage"},"createTimestamp":1763031212314,"lastUpdateTimestamp":1763031212314,"entityVersion":1,"storageConfigInfo":{"endpoint":"http://rgw1:7480","stsUnavailable":true,"pathStyleAccess":true,"storageType":"S3","allowedLocations":["s3://polaris-storage"]}}]}
$ docker container logs ceph-polaris-1
INFO exec -a "java" java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005 -XX:MaxRAMPercentage=80.0 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -cp "." -jar /deployments/quarkus-run.jar
INFO running in /deployments
Listening for transport dt_socket at address: 5005
...
Powered by Quarkus 3.28.2
2025-11-13 10:53:26,710 WARN [io.qua.config] [,] [,,,] (main) The "quarkus.log.file.enable" config property is deprecated and should not be used anymore.
2025-11-13 10:53:26,711 WARN [io.qua.config] [,] [,,,] (main) The "quarkus.log.console.enable" config property is deprecated and should not be used anymore.
2025-11-13 10:53:28,092 INFO [org.apa.pol.ser.con.ServiceProducers] [,] [,,,] (main) Bootstrapping realm(s) 'POLARIS', if necessary, from root credentials set provided via the environment variable POLARIS_BOOTSTRAP_CREDENTIALS or Java system property polaris.bootstrap.credentials ...
2025-11-13 10:53:28,171 INFO [org.apa.pol.ser.con.ServiceProducers] [,] [,,,] (main) Realm 'POLARIS' automatically bootstrapped, credentials taken from root credentials set provided via the environment variable POLARIS_BOOTSTRAP_CREDENTIALS or Java system property polaris.bootstrap.credentials, not printed to stdout.
2025-11-13 10:53:28,183 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) ⚠️ Production readiness checks failed! Check the warnings below.
2025-11-13 10:53:28,183 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) - ⚠️ The current metastore is intended for tests only. Offending configuration option: 'polaris.persistence.type'.
2025-11-13 10:53:28,183 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) - ⚠️ A public key file wasn't provided and will be generated. Offending configuration option: 'polaris.authentication.token-broker.rsa-key-pair.public-key-file'.
2025-11-13 10:53:28,184 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) - ⚠️ A private key file wasn't provided and will be generated. Offending configuration option: 'polaris.authentication.token-broker.rsa-key-pair.private-key-file'.
2025-11-13 10:53:28,184 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) - ⚠️ The realm context resolver is configured to map requests without a realm header to the default realm. Offending configuration option: 'polaris.realm-context.require-header'.
2025-11-13 10:53:28,184 WARN [org.apa.pol.ser.con.ProductionReadinessChecks] [,] [,,,] (main) Refer to https://polaris.apache.org/in-dev/unreleased/configuring-polaris-for-production for more information.
2025-11-13 10:53:28,237 INFO [io.quarkus] [,] [,,,] (main) Apache Polaris Server (incubating) 1.2.0-incubating on JVM (powered by Quarkus 3.28.2) started in 2.803s. Listening on: http://0.0.0.0:8181. Management interface listening on http://0.0.0.0:8182.
2025-11-13 10:53:28,238 INFO [io.quarkus] [,] [,,,] (main) Profile prod activated.
2025-11-13 10:53:28,238 INFO [io.quarkus] [,] [,,,] (main) Installed features: [agroal, amazon-sdk-rds, cdi, hibernate-validator, jdbc-postgresql, micrometer, narayana-jta, oidc, opentelemetry, reactive-routes, rest, rest-jackson, security, smallrye-context-propagation, smallrye-fault-tolerance, smallrye-health, vertx]
2025-11-13 10:53:31,860 INFO [org.apa.pol.ser.con.PolarisIcebergObjectMapperCustomizer] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000001,POLARIS] [,,,] (executor-thread-1) Limiting request body size to 10485760 bytes
2025-11-13 10:53:31,883 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000001,POLARIS] [,,,] (executor-thread-1) 10.89.0.8 - - [13/Nov/2025:10:53:31 +0000] "POST /api/catalog/v1/oauth/tokens HTTP/1.1" 200 757
2025-11-13 10:53:32,318 INFO [org.apa.pol.ser.adm.PolarisServiceImpl] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000002,POLARIS] [,,,] (executor-thread-1) Created new catalog class PolarisCatalog {
class Catalog {
type: INTERNAL
name: quickstart_catalog
properties: class CatalogProperties {
{default-base-location=s3://polaris-storage}
defaultBaseLocation: s3://polaris-storage
}
createTimestamp: 1763031212314
lastUpdateTimestamp: 0
entityVersion: 1
storageConfigInfo: class AwsStorageConfigInfo {
class StorageConfigInfo {
storageType: S3
allowedLocations: [s3://polaris-storage]
}
roleArn: null
externalId: null
userArn: null
region: null
endpoint: http://rgw1:7480
stsEndpoint: null
stsUnavailable: true
endpointInternal: null
pathStyleAccess: true
}
}
}
2025-11-13 10:53:32,322 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000002,POLARIS] [,,,] (executor-thread-1) 10.89.0.8 - root [13/Nov/2025:10:53:32 +0000] "POST /api/management/v1/catalogs HTTP/1.1" 201 343
2025-11-13 10:53:32,334 INFO [org.apa.pol.ser.adm.PolarisServiceImpl] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000003,POLARIS] [,,,] (executor-thread-1) Adding grant class AddGrantRequest {
grant: class CatalogGrant {
class GrantResource {
type: catalog
}
privilege: CATALOG_MANAGE_CONTENT
}
} to catalogRole catalog_admin in catalog quickstart_catalog
2025-11-13 10:53:32,347 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000003,POLARIS] [,,,] (executor-thread-1) 10.89.0.8 - root [13/Nov/2025:10:53:32 +0000] "PUT /api/management/v1/catalogs/quickstart_catalog/catalog-roles/catalog_admin/grants HTTP/1.1" 201 -
2025-11-13 10:53:32,358 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000004,POLARIS] [,,,] (executor-thread-1) 10.89.0.8 - root [13/Nov/2025:10:53:32 +0000] "GET /api/management/v1/catalogs HTTP/1.1" 200 370
2025-11-13 10:53:55,820 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000005,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - - [13/Nov/2025:10:53:55 +0000] "POST /api/catalog/v1/oauth/tokens HTTP/1.1" 200 757
2025-11-13 10:53:55,903 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000006,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - root [13/Nov/2025:10:53:55 +0000] "GET /api/catalog/v1/config?warehouse=quickstart_catalog HTTP/1.1" 200 2128
2025-11-13 10:54:01,849 INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000007,POLARIS] [,,,] (executor-thread-1) Handling runtimeException Namespace does not exist: ns
2025-11-13 10:54:01,859 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000007,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - root [13/Nov/2025:10:54:01 +0000] "GET /api/catalog/v1/quickstart_catalog/namespaces/ns HTTP/1.1" 404 97
2025-11-13 10:54:01,891 INFO [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000008,POLARIS] [,,,] (executor-thread-1) Initializing non-federated catalog
2025-11-13 10:54:01,913 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000008,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - root [13/Nov/2025:10:54:01 +0000] "POST /api/catalog/v1/quickstart_catalog/namespaces HTTP/1.1" 200 89
2025-11-13 10:54:07,044 INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000009,POLARIS] [,,,] (executor-thread-1) Handling runtimeException Table does not exist: ns.t1
2025-11-13 10:54:07,045 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000009,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - root [13/Nov/2025:10:54:07 +0000] "GET /api/catalog/v1/quickstart_catalog/namespaces/ns/tables/t1?snapshots=all HTTP/1.1" 404 92
2025-11-13 10:54:07,079 INFO [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000010,POLARIS] [,,,] (executor-thread-1) Initializing non-federated catalog
2025-11-13 10:54:07,083 INFO [org.apa.ice.BaseMetastoreCatalog] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000010,POLARIS] [,,,] (executor-thread-1) Table properties set at catalog level through catalog properties: {}
2025-11-13 10:54:07,084 INFO [org.apa.ice.BaseMetastoreCatalog] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000010,POLARIS] [,,,] (executor-thread-1) Table properties enforced at catalog level through catalog properties: {}
2025-11-13 10:54:07,114 INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000010,POLARIS] [,,,] (executor-thread-1) Handling runtimeException Credential vending was requested for table ns.t1, but no credentials are available
2025-11-13 10:54:07,115 INFO [io.qua.htt.access-log] [b5d3ec71-48f7-4e21-905f-2e3f32109485_0000000000000000010,POLARIS] [,,,] (executor-thread-1) 10.89.0.7 - root [13/Nov/2025:10:54:07 +0000] "POST /api/catalog/v1/quickstart_catalog/namespaces/ns/tables HTTP/1.1" 400 151
Credential vending was requested for table ns.t1, but no credentials are available
java.lang.IllegalArgumentException: Credential vending was requested for table ns.t1, but no credentials are available
at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:230)
(same error as mentioned below)
Well, I was able to reproduce your error by adding this to the Spark configuration:
--conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation="vended-credentials"
In my example I am removing this config altogether, but can you please explicitly define it with an empty value?
--conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=""
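For illustration only (untested on my end), appended to the spark-sql command from the guide it would look like this:

spark-sql \
  ... \
  --conf spark.sql.catalog.polaris.s3.access-key-id=$RGW_ACCESS_KEY \
  --conf spark.sql.catalog.polaris.s3.secret-access-key=$RGW_SECRET_KEY \
  --conf spark.sql.catalog.polaris.header.X-Iceberg-Access-Delegation=""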
spark-sql ()> create namespace ns;
Time taken: 0.374 seconds
spark-sql ()> create table ns.t1 as select 'abc';
Still not working for me:
$ export RGW_ACCESS_KEY=POLARIS123ACCESS # Access key for Polaris S3 user
$ export RGW_SECRET_KEY=POLARIS456SECRET # Secret key for Polaris S3 user
$ spark-sql \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
--conf spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.polaris.type=rest \
--conf spark.sql.catalog.polaris.io-impl="org.apache.iceberg.aws.s3.S3FileIO" \
--conf spark.sql.catalog.polaris.uri=http://localhost:8181/api/catalog \
--conf spark.sql.catalog.polaris.token-refresh-enabled=true \
--conf spark.sql.catalog.polaris.warehouse=quickstart_catalog \
--conf spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL \
--conf spark.sql.catalog.polaris.credential=root:s3cr3t \
--conf spark.sql.catalog.polaris.client.region=irrelevant \
--conf spark.sql.catalog.polaris.s3.access-key-id=$RGW_ACCESS_KEY \
--conf spark.sql.catalog.polaris.s3.secret-access-key=$RGW_SECRET_KEY
25/11/12 07:50:56 WARN Utils: Your hostname, shark resolves to a loopback address: 127.0.1.1; using 192.168.x.x instead (on interface enp14s0)
25/11/12 07:50:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
:: loading settings :: url = jar:file:/home/snazy/.sdkman/candidates/spark/3.5.3/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /home/snazy/.ivy2/cache
The jars for the packages stored in: /home/snazy/.ivy2/jars
org.apache.iceberg#iceberg-spark-runtime-3.5_2.12 added as a dependency
org.apache.iceberg#iceberg-aws-bundle added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-2fdcef36-748e-42b7-815e-6aac08972a3c;1.0
confs: [default]
found org.apache.iceberg#iceberg-spark-runtime-3.5_2.12;1.9.0 in central
found org.apache.iceberg#iceberg-aws-bundle;1.9.0 in central
:: resolution report :: resolve 56ms :: artifacts dl 1ms
:: modules in use:
org.apache.iceberg#iceberg-aws-bundle;1.9.0 from central in [default]
org.apache.iceberg#iceberg-spark-runtime-3.5_2.12;1.9.0 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 2 | 0 | 0 | 0 || 2 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-2fdcef36-748e-42b7-815e-6aac08972a3c
confs: [default]
0 artifacts copied, 2 already retrieved (0kB/3ms)
25/11/12 07:50:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/11/12 07:50:57 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
25/11/12 07:50:57 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
25/11/12 07:50:58 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
25/11/12 07:50:58 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore snazy@127.0.1.1
Spark Web UI available at http://x.x.x.x:4040
Spark master: local[*], Application Id: local-1762930257016
spark-sql (default)> use polaris;
25/11/12 07:51:01 WARN AuthManagers: Inferring rest.auth.type=oauth2 since property credential was provided. Please explicitly set rest.auth.type to avoid this warning.
25/11/12 07:51:01 WARN OAuth2Manager: Iceberg REST client is missing the OAuth2 server URI configuration and defaults to http://localhost:8181/api/catalog/v1/oauth/tokens. This automatic fallback will be removed in a future Iceberg release.It is recommended to configure the OAuth2 endpoint using the 'oauth2-server-uri' property to be prepared. This warning will disappear if the OAuth2 endpoint is explicitly configured. See https://github.com/apache/iceberg/issues/10537
25/11/12 07:51:01 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Time taken: 0.566 seconds
spark-sql ()> create namespace ns;
[SCHEMA_ALREADY_EXISTS] Cannot create schema `ns` because it already exists.
Choose a different name, drop the existing schema, or add the IF NOT EXISTS clause to tolerate pre-existing schema.
spark-sql ()> create table ns.t1 as select 'abc';
25/11/12 07:51:06 ERROR SparkSQLDriver: Failed in [create table ns.t1 as select 'abc']
java.lang.IllegalArgumentException: Credential vending was requested for table ns.t1, but no credentials are available
at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:230)
at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:123)
at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:107)
at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:215)
at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:299)
at org.apache.iceberg.rest.BaseHTTPClient.post(BaseHTTPClient.java:88)
at org.apache.iceberg.rest.RESTSessionCatalog$Builder.stageCreate(RESTSessionCatalog.java:921)
at org.apache.iceberg.rest.RESTSessionCatalog$Builder.createTransaction(RESTSessionCatalog.java:799)
at org.apache.iceberg.CachingCatalog$CachingTableBuilder.createTransaction(CachingCatalog.java:282)
at org.apache.iceberg.spark.SparkCatalog.stageCreate(SparkCatalog.java:265)
at org.apache.spark.sql.connector.catalog.StagingTableCatalog.stageCreate(StagingTableCatalog.java:94)
at org.apache.spark.sql.execution.datasources.v2.AtomicCreateTableAsSelectExec.run(WriteToDataSourceV2Exec.scala:121)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:691)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:682)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:713)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:744)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:68)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:501)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:619)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:613)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:613)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:310)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:75)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:52)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
This PR introduces a complete Docker Compose-based Ceph cluster environment for Polaris integration.
The goal is to simplify the developer experience for Polaris contributors and testers who want to:
- Experiment with Ceph-based S3 storage locally, including MON, MGR, OSD, and RGW
- Validate Polaris integration against RGW without a full cluster deployment
A minimal bring-up sketch is included below.
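The sketch assumes the service names and ports visible in the review comments above (polaris-setup, rgw1:7480, polaris:8181); the README added by this PR is authoritative:

# Start Ceph (MON, MGR, OSD, RGW), Polaris, and the one-shot setup container
docker compose up -d

# Follow the setup container until it has created the RGW user, bucket, and Polaris catalog
docker compose logs -f polaris-setup

# Quick smoke test: RGW answers S3 requests on its published port (7480 in the logs above);
# the Polaris API listens on 8181, the management/health interface on 8182
curl -s http://localhost:7480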