Skip to content

Conversation

@dimas-b
Copy link
Contributor

@dimas-b dimas-b commented Jul 8, 2025

This change allows configuring the "path-style" access mode in S3 clients (both in Polaris Servers and Iceberg REST Catalog API clients).

This change is applicable both to AWS storage and to non-AWS S3-compatible storage (#1530).

The Management API change is backward-compatible with clients lenient to unknown JSON properties.

Includes a fix to AWS config object round-trip via generated API classes.

Dev ML discussion: https://lists.apache.org/thread/oxy2yd4mbq06zr4z57lfptgy95c3lyhb

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use the default here, or just leave it null to leverage the default used in orElse. Better not to proliferate the constants around like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to use PATH_STYLE_ACCESS_DEFAULT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have a default at the API level, is this actually nullable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove the API default.

snazy
snazy previously approved these changes Jul 9, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jul 9, 2025
singhpk234
singhpk234 previously approved these changes Jul 9, 2025
Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@dimas-b
Copy link
Contributor Author

dimas-b commented Jul 9, 2025

I had to remove one commit (discussed in the API yaml comment thread). PTAL.

Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clarify if this is a breaking change (does it need to target 2.0? Go into API version 2?) and plan accordingly before merging.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Jul 10, 2025
snazy
snazy previously approved these changes Jul 10, 2025
Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dimas-b
Copy link
Contributor Author

dimas-b commented Jul 14, 2025

@eric-maynard : I believe I addressed your backward-compatibility concern in the latest push. Could you review again?

@dimas-b dimas-b force-pushed the s3-path-style branch 2 times, most recently from 6157bc0 to 85bbabe Compare July 15, 2025 16:27
dimas-b added 3 commits July 15, 2025 19:03
This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

The Management API change is backward-compatible.
eric-maynard
eric-maynard previously approved these changes Jul 16, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jul 16, 2025
type: boolean
description: >-
Whether S3 requests to files in this catalog should use 'path-style addressing for buckets'.
Default: false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: With the default in the spec, I don't think we need the default in the doc. Other similar variables don't have it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed 👍

@dimas-b dimas-b merged commit 987c554 into apache:main Jul 16, 2025
12 checks passed
@dimas-b dimas-b deleted the s3-path-style branch July 16, 2025 15:46
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jul 16, 2025
snazy added a commit to snazy/polaris that referenced this pull request Jul 17, 2025
snazy added a commit that referenced this pull request Jul 21, 2025
…ess (#2127)

This change adds support for endpoint, sts-endpoint, path-style-access to the Polaris Python client.

Amends #1913 and #2012
ashevchuk123 pushed a commit to mapr/polaris that referenced this pull request Oct 27, 2025
* Add `pathStyleAccess` to AwsStorageConfigInfo

This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

(cherry picked from commit 987c554)
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* chore(deps): update dependency mypy to >=1.17, <=1.17.0 (apache#2114)

* Spark 3.5.6 and Iceberg 1.9.1 (apache#1960)

* Spark 3.5.6 and Iceberg 1.9.1

* Cleanup

* Add `pathStyleAccess` to AwsStorageConfigInfo (apache#2012)

* Add `pathStyleAccess` to AwsStorageConfigInfo

This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

* Add TestFileIOFactory helper (apache#2105)

* Add FileIOFactory.wrapExisting helper

* fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.2 (apache#2125)

* fix(deps): update dependency boto3 to v1.39.7 (apache#2124)

* Abstract polaris-runtime-service tests for all persistence implementations (apache#2106)

The NoSQL persistence implementation has to run the Iceberg table & view catalog plus the Polaris specific tests as well. Reusing existing tests is beneficial to avoid a lot of code duplcation.

This change moves the actual tests to `Abstract*` classes and refactors the existing tests to extend those. The NoSQL persistence work extends the same `Abstract*` classes but runs with different Quarkus test profiles.

* Add IMPLICIT authentication support to the CLI (apache#2121)

PRs apache#1925 and apache#1912 were merged around the same time.  This PR connects the two changes and enables the CLI to accept IMPLICIT authentication type. 

Since Hadoop federated catalogs rely purely on IMPLICIT authentication, the CLI parsing test has been updated to reflect the same.

* feat(helm): Add support for external authentication (apache#2104)

* fix(deps): update dependency org.apache.iceberg:iceberg-bom to v1.9.2 (apache#2126)

* fix(deps): update quarkus platform and group to v3.24.4 (apache#2128)

* fix(deps): update dependency boto3 to v1.39.8 (apache#2129)

* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.13.3 (apache#2130)

* Add newIcebergCatalog helper (apache#2134)

creation of `IcebergCatalog` instances was quite redundant as tests
mostly use the same parameters most of the time.

also remove an unused field in 2 other tests.

* Add server and client support for the new generic table `baseLocation` field (apache#2122)

* Use Makefile to simplify setup and commands (apache#2027)

* Use Makefile to simplify setup and commands

* Add targets for minikube state management

* Add podman support and spark plugin build

* Add version target

* Update README.md for Makefile usage and relation to the project

* Fix nit

* Package polaris client as python package (apache#2049)

* Package polaris client as python package

* Package polaris client as python package

* Change owner to spark when copying files from local into Dockerfile

* CI: Address failure from accessing GH API (apache#2132)

CI sometimes fails with this failure:
```
* What went wrong:
Execution failed for task ':generatePomFileForMavenPublication'.
> Unable to process url: https://api.github.com/repos/apache/polaris/contributors?per_page=1000
```

The sometimes failing request fetches the list of contributors to be published in the "root" POM. Unauthorized GH API requests have an hourly(?) limit of 60 requests per source IP. Authorized requests have a much higher rate limit. We do have a GitHub token available in every CI run, which can be used in GH API requests. This change adds the `Authorization` header for the failing GH API request to leverage the higher rate limit and let CI not fail (that often).

* fix(deps): update dependency com.nimbusds:nimbus-jose-jwt to v10.4 (apache#2139)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.0 (apache#2142)

* fix(deps): update dependency software.amazon.awssdk:bom to v2.32.4 (apache#2146)

* fix(deps): update dependency org.xerial.snappy:snappy-java to v1.1.10.8 (apache#2138)

* fix(deps): update dependency org.junit:junit-bom to v5.13.4 (apache#2147)

* fix(deps): update dependency boto3 to v1.39.9 (apache#2137)

* fix(deps): update dependency com.fasterxml.jackson:jackson-bom to v2.19.2 (apache#2136)

* Python client: add support for endpoint, sts-endpoint, path-style-access (apache#2127)

This change adds support for endpoint, sts-endpoint, path-style-access to the Polaris Python client.

Amends apache#1913 and apache#2012

* Remove PolarisEntityManager.getCredentialCache (apache#2133)

`PolarisEntityManager` itself is not using the `StorageCredentialCache` but just hands it out via `getCredentialCache`.
the only caller of `getCredentialCache` is `FileIOUtil.refreshAccessConfig`, which in in turn is only called by `DefaultFileIOFactory` and `IcebergCatalog`.

note that in a follow-up we will likely be able to remove `PolarisEntityManager` usage completely from `IcebergCatalog`.

additional cleanups:
- use `StorageCredentialCache` injection in tests (but we need to invalidate all entries on test start)
- remove unused `UserSecretsManagerFactory` from `PolarisCallContextCatalogFactory`

* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.22-1.1752676419 (apache#2150)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.1 (apache#2152)

* fix(deps): update dependency boto3 to v1.39.10 (apache#2151)

* chore: fix class reference in the javadoc of TableLikeEntity (apache#2157)

* fix(deps): update dependency commons-codec:commons-codec to v1.19.0 (apache#2160)

* fix(deps): update dependency boto3 to v1.39.11 (apache#2159)

* Last merged commit 395459f

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Pooja Nilangekar <poojan@umd.edu>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Yun Zou <yunzou.colostate@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants