Extract interface for RequestIdGenerator #2720

adutra · 2025-09-30T13:15:18Z

Summary of changes:

Extracted an interface from RequestIdGenerator.
The generateRequestId method now returns a Uni<String> in case custom implementations need to perform I/O or other blocking calls during request ID generation.
Also addressed comments in Generate Request IDs (if not specified); Return Request ID as a Header #2602.
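
To illustrate the second point, here is a minimal, dependency-free sketch of what an asynchronous generator contract can look like. It is not the code from this PR: `CompletionStage<String>` stands in for Mutiny's `Uni<String>`, the request-context argument is omitted, and the names `AsyncRequestIdGenerator`, `UuidRequestIdGenerator`, and `BlockingSourceRequestIdGenerator` are hypothetical.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Supplier;

// Hypothetical stand-in for the extracted interface. The PR returns a Mutiny Uni<String>;
// CompletionStage<String> is used here only to keep the sketch free of dependencies.
interface AsyncRequestIdGenerator {
  CompletionStage<String> generateRequestId();
}

// Cheap, non-blocking case: the ID is computed inline and wrapped in an already-completed stage.
final class UuidRequestIdGenerator implements AsyncRequestIdGenerator {
  @Override
  public CompletionStage<String> generateRequestId() {
    return CompletableFuture.completedFuture(UUID.randomUUID().toString());
  }
}

// Blocking case (for example a database or distributed counter, simulated by a Supplier):
// the blocking call runs on another thread so the caller's thread is never blocked.
final class BlockingSourceRequestIdGenerator implements AsyncRequestIdGenerator {
  private final Supplier<String> blockingSource;

  BlockingSourceRequestIdGenerator(Supplier<String> blockingSource) {
    this.blockingSource = blockingSource;
  }

  @Override
  public CompletionStage<String> generateRequestId() {
    // supplyAsync uses the common ForkJoinPool by default; a real implementation
    // would likely pass a dedicated executor.
    return CompletableFuture.supplyAsync(blockingSource);
  }
}
```

With an asynchronous return type, the caller composes on the result instead of waiting for it, so a slow custom implementation does not stall the thread that handles the request.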


adutra · 2025-09-30T13:19:27Z

cc @adnanhemani

adnanhemani · 2025-09-30T20:05:56Z

runtime/service/src/main/java/org/apache/polaris/service/tracing/RequestIdFilter.java

+            : requestIdGenerator.generateRequestId(rc))
+        .onItem()
+        .invoke(id -> rc.setProperty(REQUEST_ID_KEY, id))
+        .invoke(id -> ContextLocals.put(REQUEST_ID_KEY, id))


What is this for?

Same as we do in RealmContextFilter: this places the request ID in the current Vert.x context for later retrieval by another event loop thread:

https://quarkus.io/guides/duplicated-context#context-local-data

IIUC, ContextLocals is a global container that can hold anything used within one request context. Another option is to use an explicit RequestScoped bean to store these, as we did with the call context bean. That would be a more typed solution. How would you compare both approaches?

In a slightly orthogonal direction: I don't see us using the variables we've already stored in ContextLocals anywhere. So while I see that we are putting the information in RealmContextFilter, what is our (proposed?) use of this?

Another thought (I'm not sure if this actually answers the question) - if we were to use the request ID later downstream in the application (for example, Event Listeners), would it be better to use ContextLocals or ContainerRequestContext?

How would you compare both approaches?

Indeed, a request-scoped bean is undoubtedly better.
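
For readers comparing the two styles, here is a rough plain-Java sketch of the difference in typing. It assumes nothing about the real `ContextLocals` or CDI APIs: `UntypedRequestLocals` and `RequestIdHolder` are hypothetical classes, and the scope annotations a real request-scoped bean would carry are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Untyped style: a string-keyed bag, similar in spirit to context-local data.
// Callers must know the key and the expected type; mistakes surface only at runtime.
final class UntypedRequestLocals {
  private final Map<String, Object> values = new HashMap<>();

  void put(String key, Object value) {
    values.put(key, value);
  }

  @SuppressWarnings("unchecked")
  <T> T get(String key) {
    return (T) values.get(key);
  }
}

// Typed style: roughly what a request-scoped bean would expose (scope annotations omitted).
// The compiler checks both the property name and its type.
final class RequestIdHolder {
  private String requestId;

  String getRequestId() {
    return requestId;
  }

  void setRequestId(String requestId) {
    this.requestId = requestId;
  }
}
```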

So while I see that we are putting the information in RealmContextFilter, what is our (proposed?) use of this?

This was done a while ago, as I thought at that time that we would be leveraging context propagation to "transfer" some information from the initial request scope to, say, the async tasks framework.

It seems though that, since that time, we've been taking the opposite direction: that of avoiding context propagation (due to some issues with bean proxies, etc.)

The situation today is that those calls to ContextLocals are probably not necessary anymore. They can be removed.

Are we OK if I remove the call to ContextLocals in RequestIdFilter in this PR, then remove the other call in RealmContextFilter in a different PR?

FYI: #2747 (removed ContextLocals and seized the opportunity for some minor cleanup).

UPDATE: I actually remember now 😄

ContextLocals is required in this specific place:

polaris/runtime/service/src/main/java/org/apache/polaris/service/metrics/RealmIdTagContributor.java

Line 41 in 20febda

context.requestContextLocalData(RealmContextFilter.REALM_CONTEXT_KEY);

I even remember asking the Quarkus devs for this feature:

quarkusio/quarkus#47887

So, TLDR: we need to keep ContextLocals in RealmContextFilter. As for RequestIdFilter we can remove it, at least for now.

flyrain · 2025-09-30T20:33:32Z

runtime/service/src/main/java/org/apache/polaris/service/tracing/RequestIdGenerator.java

+   * Generates a new request ID. IDs must be fast to generate and unique. If the generation involves
+   * I/O (which is not recommended), it should be performed asynchronously.


Can we remove this part of the comment, if involving I/O is not recommended?

Suggested change

-   * Generates a new request ID. IDs must be fast to generate and unique. If the generation involves
-   * I/O (which is not recommended), it should be performed asynchronously.
+   * Generates a new request ID. IDs must be fast to generate and unique.

flyrain

Thanks @adutra for the PR — it looks good to me overall!
Would you mind elaborating a bit on the motivation behind it?
In what situation would we need a custom request ID generator?

adutra · 2025-10-01T13:54:55Z

Would you mind elaborating a bit on the motivation behind it?

I'm trying to avoid what happened with the RealmContextFilter a while ago: #1345.

IOW, we want to make sure that whatever technique is used to generate a request ID, we won't see Quarkus complaining about it being blocking.

In what situation would we need a custom request ID generator?

I could see e.g. users storing their request IDs in the database, or using Zookeeper's distributed atomic counter.

dimas-b · 2025-10-01T15:00:45Z

runtime/service/src/main/java/org/apache/polaris/service/tracing/DefaultRequestIdGenerator.java

+    @Override
+    @Nonnull
+    public String toString() {
+      return String.format("%s_%019d", uuid(), counter());


nit: if lexicographic ordering is desired (based on previous conversations), it might be worth using something like <MILLIS_SINCE_EPOCH_PADDED>_<COUNTER_PADDED>_<UUID>. WDYT?

or <MILLIS_PADDED>_<SMALL_COUNTER>_<UUID> - SMALL_COUNTER meaning something in the range of 10K to avoid calling the system clock too often.

but this might be getting too much into the monotonic clock area 😅
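
A small sketch of the two layouts being compared, for reference. The first method mirrors the `String.format` call in the diff; the second is only one possible reading of the `<MILLIS_SINCE_EPOCH_PADDED>_<COUNTER_PADDED>_<UUID>` suggestion. The class name `RequestIdFormats`, the method names, and the 13-digit width for the millisecond field are assumptions made for this example, and counters are assumed to be non-negative.

```java
import java.util.UUID;

final class RequestIdFormats {
  private RequestIdFormats() {}

  // Layout from the diff: UUID first, then the counter zero-padded to 19 digits.
  // IDs sort by UUID first, so ordering is only meaningful among IDs sharing a UUID.
  static String uuidFirst(UUID uuid, long counter) {
    return String.format("%s_%019d", uuid, counter);
  }

  // Layout suggested in review: padded epoch millis, padded counter, then UUID.
  // Fixed-width numeric prefixes make string order follow time, then counter.
  static String timeFirst(long epochMillis, long counter, UUID uuid) {
    return String.format("%013d_%019d_%s", epochMillis, counter, uuid);
  }
}
```

With the time-first layout, two IDs generated one millisecond apart compare in generation order regardless of their UUIDs, which is the property the suggestion is after.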

I'm not sure lexicographic sort order is desirable for anything other than making it easier for humans to visualize and compare request IDs – so any of your suggestions is fine 😄
How about we get this in without changing the generation logic, and then, we look into ways of making the logic better?

Absolutely 👍

Personally, I think it's fine if the requests are only lexicographically sorted across all requests made per node. IMO request ID sorting doesn't mean much for API clients as they also usually log timestamps.

I'm okay to change it, if you both feel there's something better - but just my two cents 😃

adnanhemani · 2025-10-02T09:15:58Z

In what situation would we need a custom request ID generator?

I could see e.g. users storing their request IDs in the database, or using Zookeeper's distributed atomic counter.

To be clear, I am mostly fine with the changes and am asking mainly to learn here. If users want to store their request IDs in a database prior to the API call, isn't the expected workflow that they would store the request IDs on the client side and pass them into Polaris via the request headers? Can you please explain further in what use case the server side would actually want to store (and then use) pre-computed request IDs?

I think ZK's distributed atomic counter would be a good use case where making request ID asynchronous would help, although I'm sure most developers nowadays would prefer not to use ZK for a use case like creating request IDs unless they have no other option :)

adutra · 2025-10-02T10:08:48Z

FYI. One thing that annoys me a bit but I didn't want to change it without prior discussion:

The classes RequestIdFilter, RequestIdGenerator and RequestIdResponseFilter are all in the org.apache.polaris.service.tracing package – but the configuration they use is declared in the org.apache.polaris.service.logging package and the configuration option is prefixed with polaris.log.

I find this setup a little confusing, especially for developers.

I can provide a "fix" for that later if you all agree.

Summary of changes:

Remove the call to ContextLocals (context: apache#2720 (comment)).
Don't include the exception's message in the response as it can leak details about Polaris internals.
Add a small test for success and failure cases.

flyrain

Thanks @adutra !


github-project-automation bot added this to Basic Kanban Board Sep 30, 2025

github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Sep 30, 2025

adutra force-pushed the request-id-generator-interface branch from 8f5f2b6 to 6e58c97 Compare September 30, 2025 13:18

adnanhemani reviewed Sep 30, 2025

flyrain reviewed Sep 30, 2025

review

b44573f

dimas-b previously approved these changes Oct 1, 2025

github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Oct 1, 2025

Remove ContextLocals

3221913

adutra dismissed dimas-b’s stale review via 3221913 October 2, 2025 09:56

adutra mentioned this pull request Oct 2, 2025

Cleanup RealmContextFilter #2747

Merged

dimas-b approved these changes Oct 2, 2025

flyrain approved these changes Oct 2, 2025

sfc-gh-ahemani approved these changes Oct 2, 2025

adnanhemani approved these changes Oct 2, 2025

adutra merged commit d2a607a into apache:main Oct 3, 2025
16 checks passed

github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Oct 3, 2025

adutra deleted the request-id-generator-interface branch October 3, 2025 14:46

adutra mentioned this pull request Oct 3, 2025

Rename request ID to correlation ID #2757

Closed
