@adnanhemani
Contributor

This PR makes two changes:

  1. Generate a Request ID for every incoming request. A Request ID supplied by the incoming request takes priority; if none is supplied, a new one is generated. The Request ID is therefore always available in the ContainerRequestContext downstream.
  2. Add a new ContainerResponseFilter that adds the Request ID as a header to every response.
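
The two steps above can be sketched in plain Java to show the intended precedence. This is only an illustration under assumptions: the class and method names are hypothetical, plain maps stand in for the JAX-RS request and response contexts, and the header name is the Polaris-Request-Id name mentioned later in this review.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical stand-in for the request/response filter pair; not the PR's actual classes.
final class RequestIdFlowSketch {
  // Assumed header name; real HTTP header lookup is case-insensitive, a plain Map is not.
  static final String HEADER = "Polaris-Request-Id";

  // Step 1: prefer the caller-supplied ID, otherwise generate one.
  static String resolveRequestId(Map<String, String> requestHeaders, Supplier<String> generator) {
    String requestId = requestHeaders.get(HEADER);
    return requestId != null ? requestId : generator.get();
  }

  // Step 2: echo the resolved ID back on the response.
  static Map<String, String> withRequestId(Map<String, String> responseHeaders, String requestId) {
    Map<String, String> copy = new HashMap<>(responseHeaders);
    copy.put(HEADER, requestId);
    return copy;
  }

  public static void main(String[] args) {
    Supplier<String> generator = () -> UUID.randomUUID().toString();
    String supplied = resolveRequestId(Map.of(HEADER, "abc-123"), generator);
    String generated = resolveRequestId(Map.of(), generator);
    System.out.println(withRequestId(Map.of(), supplied));
    System.out.println(withRequestId(Map.of(), generated));
  }
}
```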

I'm having some trouble naming the classes, so please suggest better names!

@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Sep 18, 2025
@adnanhemani adnanhemani marked this pull request as ready for review September 18, 2025 08:00
if (requestId == null) {
  requestId = UUID.randomUUID().toString();
}
MDC.put(REQUEST_ID_KEY, requestId);
Contributor

I'm admittedly nit-picking a bit 😅 but I'd argue that the MDC.put statement should stay in LoggingMDCFilter since it's about logging.

What should be moved here is just the rc.setProperty call.

Contributor Author

Sure, I can agree to that :) Will adjust in this revision.


@Provider
public class RequestIdResponseFilter implements ContainerResponseFilter {
  private static final String REQUEST_ID_HEADER = "X-Request-Id";
Contributor

The X- prefix is deprecated, FYI. See:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers

But more importantly, why not use the header name used for incoming requests, Polaris-Request-Id?

Contributor Author

TIL "X-" is deprecated! Thanks!

We can - I thought this was more of a standardized name prior. Let me change this.

return Map.of(
    "polaris.log.request-id-header-name",
    REQUEST_ID_HEADER,
    "polaris.bootstrap.credentials",
Contributor

This doesn't look right. There is no Quarkus configuration named polaris.bootstrap.credentials.

Contributor Author

That's odd - I found it from a different test and the README as well.

I've removed it and the test still works 🤔 I can investigate this later as this is not the point of this PR.

Contributor

The README is correct, but the other test is wrong indeed.

adutra previously approved these changes Sep 22, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Sep 22, 2025
public void filter(ContainerRequestContext rc) {
  var requestId = rc.getHeaderString(loggingConfiguration.requestIdHeaderName());
  if (requestId == null) {
    requestId = UUID.randomUUID().toString();
Contributor

Is exhausting the randomness pool a concern? @snazy : WDYT?

Contributor

Good point. Also: UUID.randomUUID() is in theory a blocking call (happens only when the entropy source is empty though), and therefore should be avoided in filters.

Contributor

If the request ID does not have to be a UUID (I'm a bit out-of-date on this) it may be worth using a per-node (or per-thread) UUID (allocated one per restart) plus a simple counter.
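
A minimal sketch of that idea, assuming the request ID may be any unique string: one random UUID is drawn when the generator is created (once per JVM start), and each request only increments a counter. The class name and the "_" separator are illustrative choices, not necessarily what the PR uses.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one random UUID per generator instance plus a simple counter.
final class PrefixCounterIdSketch {
  // The only randomUUID() call; it runs once, at construction time.
  private final String nodePrefix = UUID.randomUUID().toString();
  private final AtomicLong counter = new AtomicLong();

  // Thread-safe: incrementAndGet hands every caller a distinct value.
  String nextId() {
    return nodePrefix + "_" + counter.incrementAndGet();
  }

  public static void main(String[] args) {
    PrefixCounterIdSketch ids = new PrefixCounterIdSketch();
    System.out.println(ids.nextId());
    System.out.println(ids.nextId());
  }
}
```

With this shape, no call on the request path touches the random number source.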

Contributor

A few suggestions:

  1. Use org.apache.polaris.ids.impl.SnowflakeIdGeneratorImpl#idToTimeUuid
  2. Use a simple AtomicLong counter

Contributor

Also UUID v7 may be worth considering (optional, for follow-up): https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.html#name-uuid-version-7
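
For reference, the version 7 layout (later standardized in RFC 9562) can be sketched with the JDK alone: a 48-bit Unix timestamp in milliseconds, the version nibble, and random bits. The class name is hypothetical, and ThreadLocalRandom is an assumption made here to stay off SecureRandom; it is not cryptographically strong, so this is only suitable where IDs need to be unique rather than unpredictable.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of a UUID version 7 value built by hand.
final class UuidV7Sketch {
  static UUID next() {
    long millis = System.currentTimeMillis();
    ThreadLocalRandom random = ThreadLocalRandom.current();
    // High 48 bits: Unix timestamp in ms; next 4 bits: version 7; low 12 bits: random.
    long msb = (millis << 16) | 0x7000L | (random.nextLong() & 0x0FFFL);
    // Top 2 bits: variant "10"; remaining 62 bits: random.
    long lsb = 0x8000000000000000L | (random.nextLong() & 0x3FFFFFFFFFFFFFFFL);
    return new UUID(msb, lsb);
  }

  public static void main(String[] args) {
    UUID id = next();
    System.out.println(id + " version=" + id.version() + " variant=" + id.variant());
  }
}
```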

Contributor Author

@adnanhemani, Sep 25, 2025

I've added the RequestIdGenerator, which is a very, very simple ID generator - since we don't require the level of complexity that has been made in the SnowflakeIdGenerator. It's just a quick implementation of @dimas-b's suggestion above:

If the request ID does not have to be a UUID (I'm a bit out-of-date on this) it may be worth using a per-node (or per-thread) UUID (allocated one per restart) plus a simple counter.

I'm very hesitant to introduce a complete dependency on the NoSql Persistence for the Service module and so creating the RequestIdGenerator is my minor effort to avoid doing that. I'm sure there may be a better reason for introducing this dependency in the future, but I don't want to sidetrack the simple goal (and simple requirements) of this PR with whether we should introduce this dependency or not.

Contributor

are we sure that UUID.randomUUID() is blocking call and in what chance will it exhaust the system entropy?

I've never got a satisfying answer for that question 😄

Most commenters stress the fact that /dev/urandom never blocks.

This is certainly true, but doesn't address the fact that the entropy pool might get exhausted at some point, in which case your UUIDs will have very poor randomness.

Furthermore, the UUID.randomUUID() call is explicitly considered blocking by BlockHound:

reactor/BlockHound#157

I think that this is due to the fact that new SecureRandom() does some magic to select the random numbers provider, and even if the default provider doesn't block (it plugs into /dev/urandom), you still can configure your JVM with a different provider that could block.
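
Which provider a given JVM actually selects can be inspected directly. The sketch below uses only java.security.SecureRandom; the class name is hypothetical, and the output depends on the JVM's security configuration, so no particular result is assumed.

```java
import java.security.SecureRandom;

// Hypothetical helper that reports what new SecureRandom() resolved to on this JVM.
final class SecureRandomInspection {
  static String describeDefault() {
    SecureRandom rng = new SecureRandom();
    return rng.getAlgorithm() + " from provider " + rng.getProvider().getName();
  }

  public static void main(String[] args) {
    System.out.println(describeDefault());
  }
}
```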

Contributor

Here’s a post that shows how to work around BlockHound: https://stackoverflow.com/a/75687886/933856.

That said, we probably shouldn’t make decisions based on a single anecdote or assumptions about JVM configurations. What if a user runs Polaris on their own JVM and it breaks? That scenario is very likely. And what if a future JVM introduces a breaking change?

Do we need to worry about that now? Probably not. It feels like a premature optimization at this stage.

Contributor

Quarkus doesn't use BlockHound – I cited BlockHound as an example. But Quarkus does have a mechanism to detect blocking calls in non-blocking contexts, we already had the problem with RealmContextFilter.

Contributor

I posted that link to show that people disagree with BlockHound on whether it is a blocking call. Does Quarkus also consider UUID.randomUUID() a blocking call? If not, that is another reason to optimize it later.


@Test
public void testRequestIdHeaderNotSpecifiedAndCounterExhausted() {
  requestIdGenerator.setCounter(Long.MAX_VALUE / 2 + 1);
Contributor

@adutra, Sep 25, 2025

This test is actually testing the internals of RequestIdGenerator. I'd prefer to move this to a new RequestIdGeneratorTest class.

Contributor Author

Done in the next revision!

assertThat(currentRequestIdBase).isNotEqualTo(uuidBase);
}

private boolean isValidDefaultUUID(String str) {
Contributor

Same here, I'm not sure we need to verify that request IDs have any particular structure. For all intents and purposes, the ID is an opaque string and it doesn't matter how it's generated, as long as it is unique.

Contributor Author

Got it - I'm still testing the format in RequestIdGenerator (just to make sure that what we expect to happen is still happening), but removed it from here.

@Priority(FilterPriorities.REQUEST_ID_FILTER)
@Provider
public class RequestIdFilter implements ContainerRequestFilter {
  @Inject RequestIdGenerator requestIdGenerator;
Contributor

nit: move this line down to line 40


@ApplicationScoped
public class RequestIdGenerator {
  private static String BASE_DEFAULT_UUID = UUID.randomUUID().toString();
Contributor

Suggested change
- private static String BASE_DEFAULT_UUID = UUID.randomUUID().toString();
+ private static String baseDefaultUuid = UUID.randomUUID().toString();

Contributor Author

TIL, thanks!


public String generateRequestId() {
  String requestId = BASE_DEFAULT_UUID + "_" + COUNTER.incrementAndGet();
  if (COUNTER.get() >= COUNTER_SOFT_MAX) {
Contributor

There are many issues here:

  1. COUNTER.get() will not necessarily return what COUNTER.incrementAndGet() returned in the previous line.
  2. Many threads may enter this if block in parallel, and will all try to assign the BASE_DEFAULT_UUID field.
  3. BASE_DEFAULT_UUID is not volatile nor atomic, so the value write on line 38 may not be immediately visible to other threads.
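
One possible way to address all three points, sketched under assumptions and not necessarily the fix that was adopted: keep the UUID prefix and the counter together in one immutable value and swap it atomically. The names and the configurable rollover threshold are hypothetical, and the record syntax needs Java 16 or later.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; names and the rollover threshold are assumptions, not the merged code.
final class AtomicStateIdSketch {
  // Immutable pair: the UUID prefix and the counter always change together.
  record State(String uuid, long counter) {
    String requestId() {
      return uuid + "_" + counter;
    }
  }

  private final long maxCounter;
  private final AtomicReference<State> state;

  AtomicStateIdSketch(long maxCounter) {
    this.maxCounter = maxCounter;
    this.state = new AtomicReference<>(new State(UUID.randomUUID().toString(), 0));
  }

  String nextId() {
    // updateAndGet retries on contention and returns exactly the state this call installed,
    // so the ID is never built from a value another thread produced.
    return state.updateAndGet(this::advance).requestId();
  }

  // Pure function of the previous state; it may be re-applied if the compare-and-set fails.
  private State advance(State previous) {
    return previous.counter() >= maxCounter
        ? new State(UUID.randomUUID().toString(), 1)
        : new State(previous.uuid(), previous.counter() + 1);
  }

  public static void main(String[] args) {
    AtomicStateIdSketch ids = new AtomicStateIdSketch(2);
    for (int i = 0; i < 4; i++) {
      System.out.println(ids.nextId());
    }
  }
}
```

Because only one compare-and-set can succeed per state, a rollover installs a single new prefix, and the AtomicReference provides the visibility that a plain static field lacks.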

Contributor Author

Sorry, you are right and I'm embarrassed that I didn't catch these. I've fixed this in the next revision using an AtomicBoolean as a way to track whether a reset is in progress. Please let me know if I've missed anything else.

Contributor Author

@adnanhemani left a comment

Responded to comments from @adutra . Please let me know if there is anything else!


import jakarta.ws.rs.Priorities;

public final class FilterPriorities {
  public static final int REQUEST_ID_FILTER = Priorities.AUTHENTICATION - 101;
Contributor

nit: -> REALM_CONTEXT_FILTER - 1 to make it more readable?
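
The suggestion could look like the sketch below. It is self-contained, so the JAX-RS constant is mirrored as a literal (jakarta.ws.rs.Priorities.AUTHENTICATION is 1000); the value of REALM_CONTEXT_FILTER is an assumption inferred from the "- 101" in the snippet above, and the class name is hypothetical.

```java
// Hypothetical sketch of priorities expressed relative to one another.
final class FilterPrioritiesSketch {
  // Mirrors jakarta.ws.rs.Priorities.AUTHENTICATION so the sketch compiles without JAX-RS.
  static final int AUTHENTICATION = 1000;
  // Assumed value: the realm context filter runs shortly before authentication.
  static final int REALM_CONTEXT_FILTER = AUTHENTICATION - 100;
  // Request filters with lower values run first, so this runs just before the realm filter.
  static final int REQUEST_ID_FILTER = REALM_CONTEXT_FILTER - 1;

  private FilterPrioritiesSketch() {}

  public static void main(String[] args) {
    System.out.println(REQUEST_ID_FILTER);
  }
}
```

Written this way, the ordering between the two filters stays correct even if the realm filter's priority changes.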

flyrain previously approved these changes Sep 26, 2025
Contributor

@flyrain left a comment

+1

flyrain previously approved these changes Sep 26, 2025
Contributor

@flyrain left a comment

+1 Thanks @adnanhemani for the change!

}

State increment() {
  return counter >= COUNTER_SOFT_MAX ? new State() : new State(uuid, counter + 1);
Contributor

Why COUNTER_SOFT_MAX ? There is no risk of overflowing here.


@Test
void testCounterIncrementsSequentially() {
  // requestIdGenerator.setCounter(0);
Contributor

nit: dangling comment.

}

@Test
void testSetCounterChangesNextGeneratedId() {
Contributor

Not sure this test has any value: setCounter is not meant to be called outside of tests, and besides, I think we can remove that method.

}

@VisibleForTesting
public void setCounter(long counter) {
Contributor

You don't need this method, RequestIdGenerator.state is already package-private.

}

@Test
void testGenerateRequestId_ReturnsUniqueIds() {
Contributor

This test could be made multi-threaded.
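
A multi-threaded variant might look roughly like this. It is a sketch with hypothetical names: the generator is passed in as a Supplier so the helper does not depend on the real RequestIdGenerator, and the thread and iteration counts are arbitrary.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical helper: hammers a generator from several threads and counts distinct IDs.
final class ConcurrentUniquenessSketch {
  static int countDistinct(Supplier<String> generator, int threads, int perThread) {
    Set<String> seen = ConcurrentHashMap.newKeySet();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        for (int i = 0; i < perThread; i++) {
          seen.add(generator.get());
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    // Duplicates collapse in the set, so any collision shows up as a smaller size.
    return seen.size();
  }

  public static void main(String[] args) {
    String prefix = UUID.randomUUID().toString();
    AtomicLong counter = new AtomicLong();
    Supplier<String> generator = () -> prefix + "_" + counter.incrementAndGet();
    System.out.println(countDistinct(generator, 8, 10_000) == 8 * 10_000);
  }
}
```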

Contributor

adutra commented Sep 30, 2025

In order to move things forward, and since @flyrain already approved, I'm going to merge this PR now even if my latest comments are still open. I will address them myself in a follow-up PR.

@adutra adutra merged commit 2f0c7a4 into apache:main Sep 30, 2025
14 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Sep 30, 2025
adutra added a commit to adutra/polaris that referenced this pull request Sep 30, 2025
Summary of changes:

1. Extracted an interface from `RequestIdGenerator`.
2. The `generateRequestId` method now returns a `CompletionStage<String>` in case custom implementations need to perform I/O or other blocking calls during request ID generation.
3. Also addressed comments in apache#2602.
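
The asynchronous contract described in this commit message can be sketched with the JDK's CompletionStage. The nested type names are hypothetical and the real interface may differ; the point is only that an in-memory implementation can return an already-completed stage while an I/O-backed one could complete later.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shape of an asynchronous request ID generator; not the actual Polaris interface.
final class AsyncRequestIdSketch {
  interface RequestIdGenerator {
    CompletionStage<String> generateRequestId();
  }

  // In-memory implementation: no blocking work, so the stage is completed immediately.
  static final class InMemoryGenerator implements RequestIdGenerator {
    private final String prefix = UUID.randomUUID().toString();
    private final AtomicLong counter = new AtomicLong();

    @Override
    public CompletionStage<String> generateRequestId() {
      return CompletableFuture.completedFuture(prefix + "_" + counter.incrementAndGet());
    }
  }

  public static void main(String[] args) {
    RequestIdGenerator generator = new InMemoryGenerator();
    generator.generateRequestId().thenAccept(System.out::println);
  }
}
```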
adutra added a commit to adutra/polaris that referenced this pull request Sep 30, 2025
Summary of changes:

1. Extracted an interface from `RequestIdGenerator`.
2. The `generateRequestId` method now returns a `Uni<String>` in case custom implementations need to perform I/O or other blocking calls during request ID generation.
3. Also addressed comments in apache#2602.
adutra added a commit that referenced this pull request Oct 3, 2025
Summary of changes:

1. Extracted an interface from `RequestIdGenerator`.
2. The `generateRequestId` method now returns a `Uni<String>` in case custom implementations need to perform I/O or other blocking calls during request ID generation.
3. Also addressed comments in #2602.
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* (Based on PR#2223)Support Namespace/Table level RBAC for external passthrough catalogs (apache#2673)

Creates missing synthetic entities for securables in external passthrough catalogs.
Based on Option 1 discussed in the RBAC section of catalog federation design doc.

In the future, we could remove calls to PolarisEntity.Builder() and replace them with entities fetched from the remote catalog. (enabling Option 2).

---------

Co-authored-by: Pooja Nilangekar <poojan@umd.edu>

* Docs: Add more details about v1 schema user to upgrade from 1.0 to 1.1 (apache#2674)

* Site: The link https://iceberg.apache.org/concepts/catalog/ doesn't exist anymore. (apache#2683)

* Docs: Add analytics for polaris.apache.org (apache#2676)

* Make ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS configurable per catalog (apache#2688)

* Update ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS to be configurable per catalog

* chore(deps): update postgres docker tag to v18 (apache#2692)

* fix(deps): update dependency org.eclipse.persistence:eclipselink to v4.0.8 (apache#2682)

* fix(deps): update dependency org.apache.logging.log4j:log4j-core to v2.25.2 (apache#2646)

* chore(deps): update dependency openapi-generator-cli to v7.15.0 (apache#2410)

* chore(deps): update dependency io.quarkus to v3.27.0 (apache#2663)

Co-authored-by: Mend Renovate <bot@renovateapp.com>

* Publish Develocity builds scans for PRs and local use (apache#2596)

This PR enables Develocity build scans for all PRs and contributors w/o an Apache account.

CI build scans in the `apache/polaris` repo against branches and tags and having access to the ASF's Develocity secret continue to publish to the ASF's Develocity instance (no behavioral change).

All other build scans are published to Gradle's public Develocity instance:
- Build scans from local developer (non-CI) runs are only published, if Gradle is invoked with the `--scan` option.
- Build scans from or targeting a repository other than `apache/polaris` need to be enabled explicitly by accepting Gradle's terms of service, via a repository variable, because this is a decision of the owner of a repository.

Advanced options to configure another Develocity server or project-ID are available (for non-`apache/polaris` repositories).

Detailed instructions in the `README.md`.

* Fix & enhancements to the Events API hierarchy (apache#2629)

Summary of changes:

- Turned `PolarisEventListener` into an interface to facilitate implementation / mocking
- Added missing `implements PolarisEvent` to many event records
- Removed unused method overrides
- Added missing method overrides to `TestPolarisEventListener`

* fix(deps): update dependency org.kordamp.gradle:jandex-gradle-plugin to v2.3.0 (apache#2694)

* Auth: reorganize internal authentication components (apache#2634)

This PR contains no functional and no user-facing change. It is merely a refactor to better organize auth code.

Summary of changes:

- Moved all internal authentication components to the `org.apache.polaris.service.auth.internal` package and subpackages
- Reduced visibility of utility classes
- Renamed `TokenBroker` class hierarchy to stick to the naming standard: `<Algorithm>JWTBroker`
- Introduced `@PolarisImmutable` whenever appropriate
- Removed unused `NoneTokenBrokerFactory` (we already have `DisabledOAuth2ApiService`)
- Removed unused `TokenBrokerFactoryConfig`

* Enhancement : adding support for Aurora postgres AWS IAM authentication (apache#2650)

Add support for postgres AWS IAM authentication using the `apache-client` lib.

* Remove unused `name` arg from findCatalogByName in PolarisAdminService (apache#2691)

* remove unused name param

* Rename for better readability

* Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures (apache#2693)

* Fix a race condition in sendNotification where concurrent parent-namespace creation causes failures

The semantics of the createNonExistingNamespaces method used during sendNotification were supposed
to be "create if needed". However, the behavior ended up surfacing an AlreadyExistsException
if multiple concurrent sendNotification attempts were made for a brand-new namespace (where
the notifications may be different tables). This would cause a table sync to fail if a sibling
table was being synced at the same time, even though the new table should successfully get created
under the shared namespace.

* Also better future-proof the createNamespaceInternal logic by explicitly
checking for ENTITY_ALREADY_EXISTS, per review suggestion.

Log a less scary message since it's not an error scenario type of race
condition, per review suggestion

* Client: add credential reset option (apache#2698)

* Client: add credential reset option

* Add integration testing

* Fix lint

* fix(deps): update dependency software.amazon.awssdk:bom to v2.34.5 (apache#2702)

* fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.2.2 (apache#2661)

* Support S3 storage that does not have STS (apache#2672)

* Support S3 storage that does not have STS

This change is backward compatible with old catalogs that have storage configuration for S3 systems with STS.

* Add new property to S3 storage config: `stsUnavailable` (defaults to "available").

* Do not call STS when unavailable in `AwsCredentialsStorageIntegration`, but still put other properties (e.g. s3.endpoint) into `AccessConfig`

Relates to apache#2615
Relates to apache#2207

* Docs/improve idp documentation (apache#2695)

* Fix Github links in IDP documentation

* Separate IDP docs for usage and development

* - Add telemetry config example
- Fix link to getting started from landing page
- Fix mentioning role-arn as required

* Fix some relative links (local Hugo resolves them properly, but PR auto checks still fails)

* Docs: narrow down --role-arn usage for AWS S3 only; fix a link in keycloak guide.

* Docs: fix a link in keycloak guide.

* chore(deps): update gradle/actions digest to 748248d (apache#2708)

* Client: fix integration testing (apache#2700)

* Add fallback in case the VERSION table is not present (apache#2653)

* initial commit

* wire up

* pastefix

* change to postgres specific code

* [Catalog Federation] Add feature flag to disallow setting sub-RBAC for federated catalog at catalog level (apache#2696)

In apache#2688 (comment), we've identified that configuring polaris.config.enable-sub-catalog-rbac-for-federated-catalogs at catalog level should not be allowed in all cases, especially when the owner is not the same subject as the catalog user or admin.

This PR adds a feature flag, ALLOW_SETTING_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS, that lets the owner disable the catalog-level setting polaris.config.enable-sub-catalog-rbac-for-federated-catalogs.

* Fix `delegationModes` parameter propagation in `createTableStaged()` (apache#2713)

This is follow-up bugfix for apache#2589

The bugfix part apache#2711 is extracted here since apache#2711 proved to be
non-trivial and may require extra time.

* Use the `delegationModes` method parameter as intended (as opposed
  to a local constant).

* Generate Request IDs (if not specified); Return Request ID as a Header (apache#2602)

* fix(deps): update dependency org.junit:junit-bom to v5.14.0 (apache#2715)

* NoSQL persistence: add Java/Vert.X executor abstraction layer (apache#2527)

Provides an abstraction to submit asynchronous tasks, optionally with a delay or delay + repetition and implementations based on Java's `ThreadPoolExecutor` and Vert.X.

* Fix RDS devservices config + adopt for `:polaris-admin:test` (apache#2723)

Changes:
* Disables devservices for `:polaris-admin` tests as well, which is necessary to _not_ spin up test containers.
* Use the explicit devservices-config as everywhere else.

The first bullet point can cause excessive memory usage, especially with more test classes, eventually killing the whole GH runner.

* fix(deps): update dependency io.smallrye:jandex to v3.5.0 (apache#2722)

* fix(deps): update dependency org.jboss.weld:weld-junit5 to v5.0.2.final (apache#2721)

* chore(deps): update quay.io/keycloak/keycloak docker tag to v26.4.0 (apache#2719)

* Last merged commit 4024557

* NoSQL: Minor-ish changes to "nodes" projects

Adopt nodes projects to OSS PR content

* NoSQL: adapt to async package rename

* Build: remove unnecessary explicit vertx-core dependency

The async-vertx implementation should not propagate a different Vert.X dependency than Quarkus provides. This wouldn't be an issue if we could just use `enforcedPlatform()` for all Quarkus-builds, but sadly we cannot for the spark-plugin-inttests.

---------

Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: Pooja Nilangekar <poojan@umd.edu>
Co-authored-by: Prashant Singh <35593236+singhpk234@users.noreply.github.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: fabio-rizzo-01 <fabio.rizzocascio@jpmorgan.com>
Co-authored-by: Dennis Huo <7410123+dennishuo@users.noreply.github.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: olsoloviov <40199597+olsoloviov@users.noreply.github.com>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Adnan Hemani <adnan.h@berkeley.edu>
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Build: remove code to post-process generated Quarkus jars (apache#2667)

Before Quarkus 3.28, the Quarkus generated jars used the "current" timestamp for all ZIP entries, which made the jars not-reproducible.
Since Quarkus 3.28, the generated jars use a fixed timestamp for all ZIP entries, so the custom code is no longer necessary.

This PR depends on Quarkus 3.28.

* Update docker.io/jaegertracing/all-in-one Docker tag to v1.74.0 (apache#2751)

* Updating metastore documentation with Aurora postgres example (apache#2706)

* added Aurora postgres to metastore documentation

* Service: Add events for APIs awaiting API changes (apache#2712)

* fix(enhancement): add .idea, .vscode, .venv to top level .gitignore (apache#2718)

fix(enhancement): add .idea, .vscode, .venv to top level .gitignore

* Fix javadocs of `PolarisPrincipal.getPrincipalRoles()` (apache#2752)

* fix(enhancement): squash commits (apache#2643)

* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.14.1 (apache#2755)

* Extract interface for RequestIdGenerator (apache#2720)

Summary of changes:

1. Extracted an interface from `RequestIdGenerator`.
2. The `generateRequestId` method now returns a `Uni<String>` in case custom implementations need to perform I/O or other blocking calls during request ID generation.
3. Also addressed comments in apache#2602.

* JDBC: Handle schema evolution (apache#2714)

* Deprecate legacy management endpoints for removal (apache#2749)

* Deprecate LegacyManagementEndpoints for removal

* Add PolarisResolutionManifestCatalogView.getResolvedCatalogEntity helper (apache#2750)

this centralizes some common code and simplifies some test setups

* Enforce that S3 credentials are vended when requested (apache#2711)

This is a follow-up change to apache#2672 striving to improve user-facing error reporting for S3 storage systems without STS.

* Add property to `AccessConfig` to indicate whether the backing storage integration can produce credentials.

* Add a check to `IcebergCatalogHandler` (leading to 400) that storage credentials are vended when requested and the backend is capable of vending credentials in principle.

* Update `PolarisStorageIntegrationProviderImpl` to indicate that FILE storage does not support credential vending (requesting credential vending with FILE storage does not produce any credentials and does not flag an error, which matches current Polaris behaviour).

* Only those S3 systems where STS is not available (or disabled / not permitted) are affected.

* Other storage integrations are not affected by this PR.

* [Catalog Federation] Ignore JIT entities when deleting federated catalogs, add integration test for namespace/table-level RBAC (apache#2690)

When enabling table/namespace level RBAC in federated catalog, JIT entities will be created during privilege grant. In the short term, we should ignore them when dropping the catalog. In the long term, we will clean-up those entities when deleting the catalog.

This will be the first step towards JIT entity clean-up:

1. Ignore JIT entities when dropping federated catalog (orphan entities)
2. Register tasks/in-place cleanup JIT entities during catalog drop
3. Add new functionality to PolarisMetastoreManager to support atomic delete non-used JIT entities during revoke.
4. Global Garbage Collector to clean-up unreachable entities (entities with non-existing catalog path/parent)

* SigV4 Auth Support for Catalog Federation - Part 3: Service Identity Info Injection (apache#2523)

This PR introduces service identity management for SigV4 Auth Support for Catalog Federation. Unlike user-supplied parameters, the service identity represents the identity of the Polaris service itself and should be managed by Polaris.

* Service Identity Injection

* Return injected service identity info in response

* Use AwsCredentialsProvider to retrieve the credentials

* Move some logic to ServiceIdentityConfiguration

* Rename ServiceIdentityRegistry to ServiceIdentityProvider

* Rename ResolvedServiceIdentity to ServiceIdentityCredential

* Simplify the logic and add more tests

* Use SecretReference and fix some small issues

* Disable Catalog Federation

* Update actions/stale digest to 5f858e3 (apache#2758)

* Service: RealmContextFilter test refactor (apache#2747)

* Update dependency software.amazon.awssdk:bom to v2.35.0 (apache#2760)

* Update apache/spark Docker tag to v3.5.7 (apache#2727)

* Update eric-maynard Team entry (apache#2763)

I'm no longer affiliated with Snowflake, so we should update this page accordingly

* Refactor resolutionManifest handling in PolarisAdminService (apache#2748)

- remove mutable `resolutionManifest` field in favor of letting the
  "authorize" methods return their `PolarisResolutionManifest`
- replace "find" helpers with "get" helpers that have built-in error
  handling
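
A minimal sketch of the pattern described above, not the actual PolarisAdminService code. `ManifestSketch`, `AdminServiceSketch`, and all method names are hypothetical stand-ins, and `IllegalArgumentException` is used only to keep the example self-contained:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for PolarisResolutionManifest: the entities resolved for one request.
record ManifestSketch(Map<String, String> resolvedEntities) {}

// Hypothetical stand-in for the admin service; it keeps no per-request mutable manifest field.
class AdminServiceSketch {
  private final Map<String, String> catalogs;

  AdminServiceSketch(Map<String, String> catalogs) {
    this.catalogs = catalogs;
  }

  // The "authorize" step returns the manifest it built instead of storing it in a field.
  // Real authorization checks are omitted in this sketch.
  ManifestSketch authorizeCatalogOperation(String catalogName) {
    return new ManifestSketch(Map.copyOf(catalogs));
  }

  // "find"-style helper: every caller has to handle the absent case itself.
  Optional<String> findCatalog(ManifestSketch manifest, String name) {
    return Optional.ofNullable(manifest.resolvedEntities().get(name));
  }

  // "get"-style helper: the error handling is built in.
  String getCatalog(ManifestSketch manifest, String name) {
    return findCatalog(manifest, name)
        .orElseThrow(() -> new IllegalArgumentException("Catalog not found: " + name));
  }
}
```

Because the manifest is a return value, each call works with its own manifest and no state leaks between operations on the same service instance.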

* Implement Finer Grained Operations and Privileges For Update Table (apache#2697)

This implements finer-grained operations and privileges for update table in a backwards-compatible way, as discussed on the mailing list.

The idea is that all existing privileges and operations will continue to work after this change (e.g. TABLE_WRITE_PROPERTIES will still authorize an update table request).

However, because Polaris will now be able to identify each operation within an UpdateTable request and has a privilege model with inheritance that maps to each operation, users will now have the option of restricting permissions at a finer level if desired.
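
The backwards-compatible check can be sketched roughly as follows. This is an illustration under assumptions, not the Polaris authorizer: only TABLE_WRITE_PROPERTIES comes from the description above, the two `*_SKETCH` privileges are hypothetical placeholders for finer-grained privileges, and the operation keys are illustrative strings.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum PrivilegeSketch {
  TABLE_WRITE_PROPERTIES,          // existing coarse privilege named in the description
  TABLE_SET_PROPERTIES_SKETCH,     // hypothetical finer-grained privilege
  TABLE_REMOVE_PROPERTIES_SKETCH   // hypothetical finer-grained privilege
}

final class UpdateTableAuthSketch {
  // For each operation found inside an update-table request: the privileges that suffice.
  // The coarse privilege is listed everywhere, which is what keeps existing grants working.
  private static final Map<String, Set<PrivilegeSketch>> SUFFICIENT = Map.of(
      "set-properties",
      EnumSet.of(PrivilegeSketch.TABLE_SET_PROPERTIES_SKETCH, PrivilegeSketch.TABLE_WRITE_PROPERTIES),
      "remove-properties",
      EnumSet.of(PrivilegeSketch.TABLE_REMOVE_PROPERTIES_SKETCH, PrivilegeSketch.TABLE_WRITE_PROPERTIES));

  // A request is authorized only if every operation in it is covered by some granted privilege.
  static boolean isAuthorized(Set<PrivilegeSketch> granted, Set<String> operationsInRequest) {
    for (String op : operationsInRequest) {
      Set<PrivilegeSketch> sufficient = SUFFICIENT.get(op);
      if (sufficient == null || sufficient.stream().noneMatch(granted::contains)) {
        return false;
      }
    }
    return true;
  }
}
```

In this sketch a principal holding only the hypothetical set-properties privilege can set properties but not remove them, while a principal holding TABLE_WRITE_PROPERTIES can do both.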

* [Python CLI][CI Failure] Pin pydantic version to < 2.12.0 to fix CI failure (apache#2770)

* Delete ServiceSecretReference (apache#2768)

* JDBC: Fix Bootstrap with schema options (apache#2762)

* Site: Add puppygraph integration (apache#2753)

* Update Changelog with finer grained authz (apache#2775)

* Add Arguments to Various Event Records (apache#2765)

* Update immutables to v2.11.5 (apache#2776)

* Client: add support for policy management (apache#2701)

Implementation for policy management via Polaris CLI (apache#1867).

Here are the subcommands to API mapping:

attach
 - PUT /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}/mappings
create
 - POST /polaris/v1/{prefix}/namespaces/{namespace}/policies
delete
 - DELETE /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}
detach
 - POST /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}/mappings
get
 - GET /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}
list
 - GET /polaris/v1/{prefix}/namespaces/{namespace}/policies 
   - This is default for `list` operation
 - GET /polaris/v1/{prefix}/applicable-policies
   - This is when we have `--applicable` option provided
update
 - PUT /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}
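
As an illustration of how a client could turn a subcommand into a request line, here is a small sketch covering the per-policy subcommands. The class and method names are hypothetical, the real CLI is written in Python, and URL encoding of path values is deliberately left out:

```java
import java.util.Map;

final class PolicyEndpointSketch {
  private static final String POLICY_PATH =
      "/polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}";

  // Replaces each {name} placeholder with its value. Values are not URL-encoded here.
  static String expand(String template, Map<String, String> vars) {
    String result = template;
    for (Map.Entry<String, String> e : vars.entrySet()) {
      result = result.replace("{" + e.getKey() + "}", e.getValue());
    }
    return result;
  }

  // Maps a subcommand to "<HTTP method> <path>", following the mapping listed above.
  static String requestLine(String subcommand, Map<String, String> vars) {
    return switch (subcommand) {
      case "get" -> "GET " + expand(POLICY_PATH, vars);
      case "update" -> "PUT " + expand(POLICY_PATH, vars);
      case "delete" -> "DELETE " + expand(POLICY_PATH, vars);
      case "attach" -> "PUT " + expand(POLICY_PATH + "/mappings", vars);
      case "detach" -> "POST " + expand(POLICY_PATH + "/mappings", vars);
      default -> throw new IllegalArgumentException("Not covered by this sketch: " + subcommand);
    };
  }
}
```

For example, `requestLine("attach", Map.of("prefix", "my_catalog", "namespace", "ns1", "policy-name", "p1"))` yields `PUT /polaris/v1/my_catalog/namespaces/ns1/policies/p1/mappings`.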

* Update dependency com.google.cloud:google-cloud-storage-bom to v2.58.1 (apache#2764)

* Update dependency org.jboss.weld:weld-junit5 to v5.0.3.Final (apache#2777)

* Update the LICENSE and NOTICE files in the runtime (apache#2779)

* SigV4 Auth Support for Catalog Federation - Part 4: Connection Credential Manager (apache#2759)

This PR introduces a flexible credential management system for Polaris. Building on Part 3's service identity management, this system combines Polaris service identities with user-provided authentication parameters to generate credentials for remote catalog access.

The core of this PR is the new ConnectionCredentialVendor interface, which:

- Generates connection credentials by combining service identity with user auth parameters
- Supports different authentication types (AWS SigV4, Azure Entra, GCP IAM) through CDI; only SigV4 is currently supported
- Provides on-demand credential generation
- Enables easy extension for new authentication types

In the long term, we should move the storage credential management logic out of PolarisMetastoreManager, which should only provide persistence interfaces.
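
The vendor-per-auth-type idea can be sketched as below. This is not the real ConnectionCredentialVendor signature: every type and method name here is hypothetical, credentials are modelled as a plain string map, and a simple `EnumMap` registry stands in for the CDI-based selection the PR uses.

```java
import java.util.EnumMap;
import java.util.Map;

enum AuthTypeSketch { SIGV4, AZURE_ENTRA, GCP_IAM }

// Hypothetical vendor shape: combine the service identity with user-supplied auth parameters.
interface CredentialVendorSketch {
  Map<String, String> vend(String serviceIdentity, Map<String, String> userAuthParams);
}

// Hypothetical entry point that picks the vendor for the requested auth type.
final class CredentialManagerSketch {
  private final Map<AuthTypeSketch, CredentialVendorSketch> vendors =
      new EnumMap<>(AuthTypeSketch.class);

  // Supporting a new auth type means registering one more vendor.
  void register(AuthTypeSketch type, CredentialVendorSketch vendor) {
    vendors.put(type, vendor);
  }

  // Credentials are generated on demand, at the moment they are requested.
  Map<String, String> getConnectionCredentials(
      AuthTypeSketch type, String serviceIdentity, Map<String, String> userAuthParams) {
    CredentialVendorSketch vendor = vendors.get(type);
    if (vendor == null) {
      throw new UnsupportedOperationException("No vendor registered for " + type);
    }
    return vendor.vend(serviceIdentity, userAuthParams);
  }
}
```

With only a SigV4 vendor registered, a request for any other auth type fails fast, which mirrors the "currently only SigV4" state described above.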

* Extract IcebergCatalog.getAccessConfig to a separate class AccessConfigProvider (apache#2736)

This PR extracts credential vending entrypoint getAccessConfig from IcebergCatalog into a new centralized AccessConfigProvider class, decoupling credential generation from catalog implementations.

The old SupportsCredentialVending is removed in this PR, as agreed in discussion.

* Update immutables to v2.11.6 (apache#2780)

* Enhance Release docs (apache#2787)

* Spark: Remove unnecessary dependency (apache#2789)

* Update Pull Request Template (apache#2788)

* Freeze 1.2 change log (apache#2783)

* [Catalog Federation] Enable Credential Vending for Passthrough Facade Catalog (apache#2784)

This PR introduces credential vending support for passthrough-facade catalogs.

When creating a passthrough-facade catalog, the configuration currently requires two components:

- StorageConfig: specifies the storage info for the remote catalog.
- ConnectionInfo: defines connection parameters for the underlying remote catalog.

With this change, the StorageConfig is now also used to vend temporary credentials for user requests.
Credential vending honors table-level RBAC policies to determine whether to issue read-only or read-write credentials, ensuring access control consistency with Polaris authorization semantics.
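
The read-only versus read-write decision could look roughly like this. It is a sketch under assumptions: the class and enum are hypothetical, and the privilege names `TABLE_READ_DATA` and `TABLE_WRITE_DATA` are used as assumed inputs rather than taken from this change.

```java
import java.util.Set;

enum StorageAccessSketch { NONE, READ_ONLY, READ_WRITE }

final class VendingDecisionSketch {
  // Picks the widest storage access that the caller's table-level privileges allow.
  // The privilege names are assumptions made for this illustration.
  static StorageAccessSketch decide(Set<String> grantedTablePrivileges) {
    if (grantedTablePrivileges.contains("TABLE_WRITE_DATA")) {
      return StorageAccessSketch.READ_WRITE;
    }
    if (grantedTablePrivileges.contains("TABLE_READ_DATA")) {
      return StorageAccessSketch.READ_ONLY;
    }
    // No data privilege on the table: vend nothing.
    return StorageAccessSketch.NONE;
  }
}
```

The resulting access level would then scope the temporary credentials derived from the StorageConfig.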

A new test case validates the credential vending workflow, verifying both read and write credential vending.

Note: the remote catalog referenced by the passthrough-facade does not need to support IRC (the Iceberg REST Catalog API).

* Site: Add docs for catalog federation (apache#2761)

* Python client: update CHANGELOG.MD for recent changes (apache#2796)

* Python client: remove Python 3.9 support (apache#2795)

* Update dependency software.amazon.awssdk:bom to v2.35.5 (apache#2799)

* FIX REG tests with cloud providers (apache#2793)

* [Catalog Federation] Block credential vending for remote tables outside allowed location list (apache#2791)

* Correct invalid example in management service OpenAPI spec (apache#2801)

The `example` was incorrectly placed as a sibling of `$ref` within a `schema` object in `polaris-management-service.yml`. According to the OpenAPI specification, properties that are siblings of a `$ref` are ignored.

This was causing a `NullPointerException` in OpenAPI Generator v7.13.0+ due to a change in how examples are processed. The generator now expects all `examples` to be valid and non-empty, and a misplaced `example` can lead to a null reference when the generator tries to access it (we are not yet using v7.13.0+, thus not a problem at the moment).

This commit moves the `example` to be a sibling of the `schema` object, which is the correct placement according to the OpenAPI specification.
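
A schematic before/after of the placement, as a config fragment only. The schema name and example value are hypothetical and are not copied from `polaris-management-service.yml`:

```yaml
# Before: `example` sits next to `$ref` inside `schema`, so it is ignored.
content:
  application/json:
    schema:
      $ref: '#/components/schemas/ExampleSchema'   # hypothetical schema name
      example: { "name": "sample" }
---
# After: `example` is a sibling of `schema` on the media type object.
content:
  application/json:
    schema:
      $ref: '#/components/schemas/ExampleSchema'   # hypothetical schema name
    example: { "name": "sample" }
```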

For reference, the error seen when using a newer version of openapi-generator-cli:
```
openapi-generator-cli generate -i spec/polaris-catalog-service.yaml -g python -o client/python --additional-properties=packageName=polaris.catalog --additional-properties=apiNameSuffix="" --skip-validate-spec --additional-properties=pythonVersion=3.13 --ignore-file-override /local/client/python/.openapi-generator-ignore 
...
  Exception: Cannot invoke "io.swagger.v3.oas.models.examples.Example.getValue()" because the return value of "java.util.Map.get(Object)" is null
	at org.openapitools.codegen.DefaultGenerator.processOperation(DefaultGenerator.java:1606)
	at org.openapitools.codegen.DefaultGenerator.processPaths(DefaultGenerator.java:1474)
	at org.openapitools.codegen.DefaultGenerator.generateApis(DefaultGenerator.java:663)
	at org.openapitools.codegen.DefaultGenerator.generate(DefaultGenerator.java:1296)
	at org.openapitools.codegen.cmd.Generate.execute(Generate.java:535)
	at org.openapitools.codegen.cmd.OpenApiGeneratorCommand.run(OpenApiGeneratorCommand.java:32)
	at org.openapitools.codegen.OpenAPIGenerator.main(OpenAPIGenerator.java:66)
Caused by: java.lang.NullPointerException: Cannot invoke "io.swagger.v3.oas.models.examples.Example.getValue()" because the return value of "java.util.Map.get(Object)" is null
	at org.openapitools.codegen.utils.ExamplesUtils.unaliasExamples(ExamplesUtils.java:75)
	at org.openapitools.codegen.DefaultCodegen.unaliasExamples(DefaultCodegen.java:2343)
	at org.openapitools.codegen.DefaultCodegen.fromResponse(DefaultCodegen.java:4934)
	at org.openapitools.codegen.DefaultCodegen.fromOperation(DefaultCodegen.java:4575)
	at org.openapitools.codegen.DefaultGenerator.processOperation(DefaultGenerator.java:1574)
	... 6 more
```

* Update dependency io.opentelemetry:opentelemetry-bom to v1.55.0 (apache#2804)

* Update dependency io.micrometer:micrometer-bom to v1.15.5 (apache#2806)

* Bump version for python deps (apache#2800)

* bump version for python deps

* Update openapi-generator-cli from 7.11.0.post0 to 7.12.0

* Pin poetry version

* Update dependency io.projectreactor.netty:reactor-netty-http to v1.2.11 (apache#2809)

* [Catalog Federation] Add Connection Credential Vendors for Other Auth Types (apache#2782)

This change is a prerequisite for enabling connection credential caching.
By making PolarisCredentialManager the central entry point for obtaining connection credentials, we can introduce caching cleanly and manage all credential flows in a consistent way.

* Last merged commit 6b957ec

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: fabio-rizzo-01 <fabio.rizzocascio@jpmorgan.com>
Co-authored-by: Adnan Hemani <adnan.h@berkeley.edu>
Co-authored-by: Artur Rakhmatulin <artur.rakhmatulin@gmail.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Prashant Singh <35593236+singhpk234@users.noreply.github.com>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: Rulin Xing <xjdkcsq3@gmail.com>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Travis Bowen <122238243+travis-bowen@users.noreply.github.com>
Co-authored-by: Jaz Ku <jsku@dons.usfca.edu>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Yufei Gu <yufei@apache.org>