@dimas-b dimas-b commented Oct 17, 2025

CLIENT_REGION is not a credential value, which is in line with Iceberg's VendedCredentialsProvider code.

Cf. apache/iceberg#11389

@dimas-b dimas-b force-pushed the client-region-non-cred branch from 1b98491 to 10204dc Compare October 17, 2025 17:35
@dimas-b dimas-b requested a review from HonahX October 17, 2025 17:36
@HonahX HonahX left a comment

Thanks for the PR @dimas-b. To confirm the context: s3.region was included in the credentials for #342 because, at that time, Iceberg did not yet support S3 cross-region access. That support was later added in apache/iceberg@3d9fc1d and released with Iceberg 1.7.0.

Would this be a breaking change for users, since s3.cross-region-access-enabled still defaults to false? I assume users who rely on this information in the vended credentials to avoid cross-region calls will encounter failures after this PR?

- `s3.secret-access-key`: secret for credentials that provide access to data in S3
- `s3.session-token
*/
StorageAccessProperty(Class valueType, String propertyName, String description) {

Is the idea here to make the enum declaration more explicit about the isCredential setting?


Yes
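
For illustration, here is a minimal sketch of what an explicit credential flag on each enum constant could look like. It is not the Polaris source: the enum name `Prop`, the two-argument constructor, and the accessor names are assumptions made for this example. Only the property names are taken from this thread.

```java
final class StoragePropertySketch {

  // Hypothetical stand-in for an access-property enum (assumed shape, not Polaris code).
  enum Prop {
    ACCESS_KEY_ID("s3.access-key-id", true),
    SECRET_ACCESS_KEY("s3.secret-access-key", true),
    SESSION_TOKEN("s3.session-token", true),
    // The region is plain configuration, so the flag is spelled out as false.
    CLIENT_REGION("client.region", false);

    private final String propertyName;
    private final boolean credential;

    Prop(String propertyName, boolean credential) {
      this.propertyName = propertyName;
      this.credential = credential;
    }

    String propertyName() {
      return propertyName;
    }

    boolean isCredential() {
      return credential;
    }
  }

  private StoragePropertySketch() {}
}
```

Passing the flag at every declaration site makes the classification visible in the enum itself rather than hidden behind a constructor default.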

dimas-b commented Oct 17, 2025

@HonahX : s3.region is still included in AccessConfig, and by extension in LoadTable responses, but as an ordinary property.

Is that sufficient for Iceberg clients? What's the best way to test it?

For background: the reason for this PR is that the "nice error message" from #2711 does not happen if the region is set in storage config (but no credentials are vended).

HonahX commented Oct 17, 2025

@dimas-b Thanks for the pointer and context. Iceberg's LoadTableResponse has separate fields for normal config and credentials:
https://github.com/apache/iceberg/blob/03f2af8ab465d4a5238753be206bcf366130f9a0/open-api/rest-catalog-open-api.yaml#L3321-L3328

The RestCatalog client only uses what is in the storage-credentials field to initialize the FileIO, which is where the vended credentials are used.

If client.region is an extra property, it will [not be added by the builder](https://github.com/apache/iceberg/blob/03f2af8ab465d4a5238753be206bcf366130f9a0/core/src/main/java/org/apache/iceberg/rest/responses/LoadTableResponse.java#L117-L130) to the credentials field, and thus the RestCatalog client cannot pick up that config when initializing the FileIO.

HonahX commented Oct 17, 2025

> For background: the reason for this PR is that the "nice error message" from #2711 does not happen if the region is set in storage config (but no credentials are vended).

Would it be easier to consider the credentials "empty" if only CLIENT_REGION is present?
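
A rough sketch of what that alternative check might look like, for comparison only. The class and method names are hypothetical, and this is not what the PR ended up doing.

```java
import java.util.Map;

final class EmptyCredentialCheckSketch {

  // Assumed key name, as it appears in the responses discussed in this thread.
  static final String CLIENT_REGION = "client.region";

  /**
   * Treats a credential map as "empty" when it has no entries at all,
   * or when the region is the only entry.
   */
  static boolean isEffectivelyEmpty(Map<String, String> credentials) {
    return credentials.keySet().stream().allMatch(CLIENT_REGION::equals);
  }

  private EmptyCredentialCheckSketch() {}
}
```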

dimas-b commented Oct 20, 2025

@HonahX : I tested with Spark 3.5 + Iceberg 1.6.1 and AWS with the code in this PR... all seems to work well... WDYT?

Note: Spark did not have any AWS FileIO config. It got everything from Polaris.

Here's a sample load table response for reference:

{
  "metadata-location": "s3://***/pol/ns/t1/metadata/00000-c8eeb778-d0ca-42ff-b48c-743f16751b7c.metadata.json",
  "metadata": {
    "format-version": 2,
    "table-uuid": "5403e77b-dbd4-425a-ba20-969bf448462b",
    "location": "s3://***/pol/ns/t1",
[...]
  },
  "config": {
    "s3.access-key-id": "***",
    "s3.secret-access-key": "****",
    "s3.session-token": "*****",
    "client.refresh-credentials-endpoint": "v1/polaris/namespaces/ns/tables/t1/credentials",
    "expiration-time": "1760987329000",
    "s3.session-token-expires-at-ms": "1760987329000",
    "client.region": "us-west-2"
  },
  "storage-credentials": [
    {
      "prefix": "s3://***/pol/ns/t1",
      "config": {
        "s3.access-key-id": "***",
        "s3.secret-access-key": "****",
        "s3.session-token": "*****",
        "expiration-time": "1760987329000",
        "s3.session-token-expires-at-ms": "1760987329000"
      }
    }
  ]
}
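
The split visible in the response above, where `client.region` appears under `config` but not under `storage-credentials`, can be sketched roughly as follows. This is an illustration under assumptions rather than the actual Polaris or Iceberg code: the class name, the method name, and the hard-coded key set (copied from the sample response) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class LoadTableConfigSketch {

  // Keys assumed to be flagged as credentials, taken from the sample response.
  static final Set<String> CREDENTIAL_KEYS =
      Set.of(
          "s3.access-key-id",
          "s3.secret-access-key",
          "s3.session-token",
          "expiration-time",
          "s3.session-token-expires-at-ms");

  /** Returns only the entries whose key is flagged as a credential. */
  static Map<String, String> storageCredentials(Map<String, String> allProperties) {
    Map<String, String> result = new LinkedHashMap<>();
    allProperties.forEach(
        (key, value) -> {
          if (CREDENTIAL_KEYS.contains(key)) {
            result.put(key, value);
          }
        });
    return result;
  }

  private LoadTableConfigSketch() {}
}
```

In this sketch the full property map plays the role of `config`, so the region still reaches the client as an ordinary property, while the filtered map plays the role of the `storage-credentials` entry.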

@HonahX HonahX left a comment

@dimas-b Thanks for the test and examples. I also did one with Spark 3.5 + Iceberg 1.9.1. Apparently I missed some details when reading through the code. You are right! The client.region will still take effect in the config field. LGTM!

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Oct 21, 2025
@HonahX HonahX merged commit acdccae into apache:main Oct 21, 2025
15 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Oct 21, 2025
@dimas-b dimas-b deleted the client-region-non-cred branch October 21, 2025 17:11
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Update Quarkus Platform and Group to v3.28.4 (apache#2786)

* Update dependency org.testcontainers:testcontainers-bom to v2.0.1 (apache#2830)

* Build/polaris-core: Remove outdated `constraint`s (apache#2818)

The `:polaris-core` build scripts contain (soft) version constraints for some dependencies with a vague reason "Vulnerability detected in ..." (concrete CVE/reason not mentioned) referencing specific dependency versions. The mentioned versions are all quite outdated, and some are not even transitively referenced. Hence this change removes those constraints, as they no longer seem relevant.

Effective dependency versions can be inspected via `./gradlew :polaris-core:dependencies --configuration runtimeClasspath`.

* Add Community Meetings for 2025-10-02 and 2025-10-16 (apache#2832)

* Update docker.io/prom/prometheus Docker tag to v3.7.1 (apache#2834)

* testcontainers v2: tackle deprecation warnings (apache#2835)

* Add findPrincipalById helper (apache#2810)

* Add findPrincipalById helper

this simplifies frequent usage of the lower level `loadEntity` api (similar to the
existing `findPrincipalByName` helper)

* [Python] Add more tests cases for policy CLI (apache#2831)

* Update dependency software.amazon.awssdk:bom to v2.35.10 (apache#2840)

* Update dependency ch.qos.logback:logback-classic to v1.5.20 (apache#2839)

* Reproducible builds: make parent pom content reproducible (apache#2826)

The parent pom contains the `<developer>` and `<contributor>` elements. The former is populated from ASF people information including role information (champion, mentor, chair, (P)PMC member, committer). The latter is retrieved from a GitHub API endpoint, ordered by number of contributions. The latter list in particular is prone to vary between builds, which makes the parent pom not reproducible, as the locally built one is likely to differ from the one built by the release manager (staged artifact).

This change removes both lists, leaving a single static `<developer>` entry pointing to `https://polaris.apache.org/community/`. Related build-script code has been updated and no longer retrieves people information.

* Log root cause exceptions in mappers (apache#2837)

Fix `IcebergExceptionMapper` and `PolarisExceptionMapper` to pass exceptions as "cause" to the logger (as opposed to unreferenced log parameters).

* Remove credential flag from `StorageAccessProperty.CLIENT_REGION` (apache#2838)

`CLIENT_REGION` is not a credential value, which is in line with
Iceberg's `VendedCredentialsProvider` code.

Cf. apache/iceberg#11389

* CI: Let all workflows use GitHub's docker.io mirror (apache#2841)

* Correct template rendering for authentication options (apache#2808)

* Correct template rendering for authentication options

* Added tpl back

* Increase javadoc visibility in `:polaris-async-vertx` (apache#2745)

This is to fix javadoc error: `No public or protected classes found to document`

* Update slack invite url (apache#2846)

* Remove unused ConcurrentLinkedQueueWithApproximateSize (apache#2849)

* Merge AwsCloudWatchConfiguration and QuarkusAwsCloudWatchConfiguration (apache#2848)

For some reason, these two classes weren't properly merged when the runtime-service and service-common modules were merged. This PR fixes that.

This PR also adds some examples of AWS Cloud Watch configuration to the default application.properties file.

* Move TestPolarisEventListener to test fixtures (apache#2850)

* Update dependency com.google.cloud:google-cloud-storage-bom to v2.59.0 (apache#2857)

* Update actions/stale digest to e46bbab (apache#2856)

* Servcie: Remove a duplicated config (apache#2854)

* Update docker.io/prom/prometheus Docker tag to v3.7.2 (apache#2858)

* Update Quarkus Platform and Group to v3.28.5 (apache#2859)

* Update dependency com.google.errorprone:error_prone_core to v2.43.0 (apache#2860)

* Add --no-sts to CLI (apache#2855)

* Add --no-sts to CLI

Following up on apache#2672, add new `--no-sts` option to CLI to allow
configuring `stsUnavailable` in `AwsStorageConfigInfo`

* Use AccessConfigProvider.getAccessConfig in DefaultFileIOFactory (apache#2852)

* CLI: Remove the trailing comma (apache#2863)

* Update dependency pip-licenses-cli to v3 (apache#2842)

* Update dependency pip-licenses-cli to v3

* Update pip-licenses-cli version format

* Fix pip-licenses-cli version specification

---------

Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>

* Update quay.io/keycloak/keycloak Docker tag to v26.4.2 (apache#2868)

* Bump main to 1.3.0-SNAPSHOT (apache#2870)

* Add properties from TableMetadata into Table entity internalProperties (apache#2735)

* Add properties from TableMetadata into Table entity internalProperties

* Made table properties constants and pulled out static utility method

* Update dependency io.smallrye:jandex to v3.5.1 (apache#2872)

* Fix exec flags on getting-started scripts (apache#2878)

* Add `+x` to script source files
* Remove (unnecessary) `chmod` from docs

* Update plugin jcstress to v0.9.0 (apache#2882)

* Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.23-6.1761164966 (apache#2874)

* Update dependency openapi-generator-cli to v7.16.0 (apache#2703)

* Update Gradle to v9 (apache#2226)

* Update Gradle to v9

* adopt gradlew

---------

Co-authored-by: Robert Stupp <snazy@snazy.de>

* Last merged commit 7892540

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Nuoya Jiang <98131931+NuoyaJiang@users.noreply.github.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Yufei Gu <yufei@apache.org>
Co-authored-by: Nuoya Jiang <98131931+CodingBangboo@users.noreply.github.com>
Co-authored-by: Michael Collado <40346148+collado-mike@users.noreply.github.com>