
Conversation

@adam-christian-software (Contributor) commented Jul 18, 2025:

Motivation

This PR starts the series of PRs implementing the NoSQL work presented in https://docs.google.com/document/d/1POUWe0xMZOBoaJ6Rgiw35ziEoc6OEYCiW7Zk6bR9H6M/edit?tab=t.0#heading=h.nx9vzhg2x8v2, beginning with the first implementation here.

Description

This PR is dedicated to creating the ID Generation Framework.

Related to #650 & #844


```java
public interface SnowflakeIdGenerator extends IdGenerator {
  /** Offset of the snowflake ID generator since the 1970-01-01T00:00:00Z epoch instant. */
  Instant EPOCH_OFFSET =
```
@dimas-b (Contributor) commented Jul 18, 2025:

nit: this is not technically an offset, but an exact moment in time. How about just EPOCH?

@adutra (Contributor):

Also, the expression is unnecessarily complex; the one below is 100% equivalent and IMHO easier to grasp:

```java
Instant.parse("2025-03-01T00:00:00Z")
```

@adam-christian-software (Contributor, author):

@dimas-b - At the risk of sounding too verbose, how about ID_GENERATOR_EPOCH? I'd like to distinguish this from the Unix Epoch.

And I agree, @adutra. I'll update.

@dimas-b (Contributor):

ID_GENERATOR_EPOCH LGTM. Alt: ID_EPOCH.

@adam-christian-software (Contributor, author):

I like ID_EPOCH. I'll do that.
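The outcome of this thread can be sketched as follows. This is an illustration only: `ID_EPOCH` is the name agreed above, `Instant.parse` is the simplification suggested in the review, and the surrounding interface is trimmed to the constant itself; the `ID_EPOCH_MILLIS` companion is an assumed convenience, not necessarily the PR's actual code.

```java
import java.time.Instant;

public interface SnowflakeIdGenerator {
    /** Epoch instant of the snowflake ID generator: 2025-03-01T00:00:00Z. */
    Instant ID_EPOCH = Instant.parse("2025-03-01T00:00:00Z");

    /** The same instant as epoch milliseconds, convenient for bit arithmetic. */
    long ID_EPOCH_MILLIS = ID_EPOCH.toEpochMilli();
}
```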

```java
long generateId();

/** Generate the system ID for a node, solely used by/for node management purposes. */
long systemIdForNode(int nodeId);
```
Contributor:

Is this ID expected to be the same for all IdGenerator implementations? What if a future impl. is not node-based?

Should this be pushed down to the Snowflake ID code?

Member:

This function isn't specific to the particular implementation.
See the upcoming node-id-lease stuff: there is one implementation with one constant configuration per setup.

Contributor (author):

Actually, the more I read about the node-ID lease work, the less sure I am that this belongs in this module. Here's my thinking:

  1. NodeID generation is special and is not Snowflake ID generation. The implementation is based only on the node ID passed in plus some system-wide constants such as SnowflakeIdGeneratorImpl#timestampMax, SnowflakeIdGeneratorImpl#timestampShift, and SnowflakeIdGeneratorImpl#sequenceBits. So, in practice, it's really just the node ID passed in.
  2. Given that, I think we could pull this out and put it into the node-leasing modules. That way, we keep IdGenerator clean for the cases that require distributed ID generation.

What do y'all think?

Contributor:

+1 to Adam's point 1.

... however, I have a bigger concern. Suppose we run with Snowflake IDs for a while and then change to another ID generator. Assume generateId() outputs do not clash. Still, do we expect systemIdForNode(X) to return the same value for all generator implementations and for all possible values of X?

```kotlin
compileOnly(platform(libs.quarkus.bom))
compileOnly("io.quarkus:quarkus-core")

compileOnly(project(":polaris-immutables"))
```
Contributor:

This dependency seems unused. I wonder if you forgot to annotate IdGeneratorSource with @PolarisImmutable? It seems like a good candidate for that (there are lots of anonymous classes implementing this interface).

Contributor:

IdGeneratorSource in this PR matches https://github.com/snazy/polaris/blob/persistence-nosql/persistence/nosql/idgen/spi/src/main/java/org/apache/polaris/ids/spi/IdGeneratorSource.java

I believe the plan is to address node ID concerns in the ID generator after we merge the whole NoSQL code. I'll file an issue for that.

I'll remove the unused dep on polaris-immutables for now.


```java
@Override
void close();

void waitUntilTimeMillisAdvanced();
```
Contributor:

Could be good to add javadocs here (and above); it's not immediately clear what this method is supposed to do (spin-wait until the clock ticks?).

Also, neither this method nor sleepMillis throws InterruptedException, which is surprising since they are clearly blocking. It could be good to add an @implSpec note about how the interrupt flag is expected to be handled.
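A hypothetical sketch of what such a method might look like with the requested javadoc and an explicit interrupt-handling contract. The class name and the contract chosen here are illustrative assumptions, not the PR's actual implementation:

```java
final class ClockWaitSketch {

    /**
     * Blocks until {@code System.currentTimeMillis()} returns a value strictly
     * greater than {@code lastMillis}, i.e. spin-waits until the clock ticks.
     *
     * @implSpec This sketch does not throw {@code InterruptedException}; an
     *     interrupt does not abort the wait, and the interrupt flag is left
     *     untouched for the caller to observe (one possible contract, per the
     *     review comment above).
     * @return the first observed clock value after the tick
     */
    static long waitUntilTimeMillisAdvanced(long lastMillis) {
        long now;
        while ((now = System.currentTimeMillis()) <= lastMillis) {
            Thread.onSpinWait(); // busy-wait hint to the CPU/JIT
        }
        return now;
    }
}
```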

```java
Map<String, String> params();

@PolarisImmutable
interface BuildableIdGeneratorSpec extends IdGeneratorSpec {
```
Contributor:

This class looks odd. AFAICT we could use ImmutableIdGeneratorSpec.builder() instead. The only difference seems to be that there is a default type here, whereas IdGeneratorSpec has none, but it's certainly possible to overcome that limitation somehow.


```java
long timestampFromId(long id);

long timestampUtcFromId(long id);
```
Contributor:

"Timestamp UTC" does not make sense to me 🤔
From the implementation class, it seems you meant "timestamp since the Unix Epoch" instead.


```java
long constructId(long timestamp, long sequence, long node);

long timestampFromId(long id);
```
Contributor:

I suppose this is understood as "since the Snowflake Epoch (2025-03-01)". Could be good to add javadocs to clarify.
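To make the two timestamp notions concrete, here is an illustrative sketch of the packing being discussed. The bit widths and the epoch constant are assumptions for illustration; the shift order (node above sequence) follows the `nodeId << sequenceBits` expression visible in the `generateId` excerpt below, and only the PR's actual code is authoritative:

```java
final class SnowflakeLayoutSketch {
    // Illustrative widths: timestamp in the high bits, then 10 node bits, then 12 sequence bits.
    static final int NODE_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    // Assumed generator epoch (2025-03-01T00:00:00Z) expressed in epoch millis.
    static final long ID_EPOCH_MILLIS = 1_740_787_200_000L;

    /** Packs (timestamp since the generator epoch, sequence, node) into one long. */
    static long constructId(long timestamp, long sequence, long node) {
        return (timestamp << (NODE_BITS + SEQUENCE_BITS))
                | (node << SEQUENCE_BITS)
                | sequence;
    }

    /** Timestamp relative to the generator's own epoch. */
    static long timestampFromId(long id) {
        return id >>> (NODE_BITS + SEQUENCE_BITS);
    }

    /** Timestamp since the Unix epoch (what "timestampUtcFromId" apparently means). */
    static long timestampUtcFromId(long id) {
        return ID_EPOCH_MILLIS + timestampFromId(id);
    }
}
```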


public interface SnowflakeIdGenerator extends IdGenerator {
/** Offset of the snowflake ID generator since the 1970-01-01T00:00:00Z epoch instant. */
Instant EPOCH_OFFSET =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the expression is unnecessarily complex, the below one is 100% equivalent and imho easier to grasp:

Instant.parse("2025-03-01T00:00:00Z")


```java
@Override
public long currentTimeMillis() {
  return SnowflakeIdGenerator.EPOCH_OFFSET_MILLIS;
```
Contributor:

This looks suspicious as it is returning a fixed timestamp. Did you mean clockMillis.getAsLong()?

Suggested change:

```diff
-    return SnowflakeIdGenerator.EPOCH_OFFSET_MILLIS;
+    return clockMillis.getAsLong();
```

Contributor (author):

So, I think there is some special-casing happening that is leaking into our interfaces and implementation. See here as well: https://github.com/apache/polaris/pull/2131/files#r2214779625

We need the Snowflake ID Generation for Commit IDs and Entity Object IDs.

We need a Node ID. Node IDs cannot be Snowflake IDs because Snowflake IDs require a NodeID.

So, I actually think that we need a NodeIdGenerator concept for Node IDs and the concepts presented here for the other IDs. Right now, we are blending them into one interface and I think that is making things a bit confusing.

For example:

  1. IdGeneratorFactory#buildSystemIdGenerator
  2. IdGenerator#systemIdForNode

I think that we probably need to separate these concepts.

What do y'all think, @dimas-b & @snazy?

Contributor:

I tend to think node ID generation should be internal to the Snowflake ID code. A sharable NodeIdGenerator might be overkill, but that is a secondary concern.

Contributor:

My initial comment stems from the fact that the clockMillis parameter is part of the method signature (thus part of the API) but is not used, so I don't know why it is there.

```java
}

@Override
public long constructId(long timestamp, long sequence, long nodeId) {
```
Contributor:

This method effectively bypasses the ID generator source. Maybe it shouldn't be public? Having two methods in the same API that generate IDs using different mechanisms seems error-prone. OTOH, this method seems to be called only from timeUuidToId in this same class, so it could probably be private.

```java
public long generateId() {
  var nodeId = idGeneratorSource.nodeId();
  checkState(nodeId >= 0, "Cannot generate a new ID, shutting down?");
  var nodeIdPattern = ((long) nodeId) << sequenceBits;
```
Contributor:

Nit: move this variable declaration to line 228. It's not needed before.

@dimas-b dimas-b requested a review from dennishuo July 21, 2025 20:32
@dimas-b dimas-b marked this pull request as ready for review August 13, 2025 13:17
@dimas-b dimas-b changed the title from "WIP feat(idgen): Start Implementation of NoSQL with the ID Generation Framework" to "feat(idgen): Start Implementation of NoSQL with the ID Generation Framework" Aug 13, 2025
@dimas-b (Contributor) commented Aug 13, 2025:

I propose to merge this PR "as is" and address comments after we get the whole NoSQL persistence merged.

I'll deal with merge conflicts presently.

snazy
snazy previously approved these changes Aug 14, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 14, 2025
@dimas-b (Contributor) commented Aug 19, 2025:

Updated to the latest main and resolved conflicts.

dimas-b
dimas-b previously approved these changes Aug 19, 2025
@dimas-b (Contributor) commented Aug 19, 2025:

If there are no objections, I propose to merge and open an issue to follow up on the topics discovered during review.

snazy
snazy previously approved these changes Aug 20, 2025
adutra
adutra previously approved these changes Aug 20, 2025
@dimas-b dimas-b dismissed stale reviews from adutra and snazy via 5b14b90 September 2, 2025 16:16
@dimas-b (Contributor) commented Sep 2, 2025:

Resolved conflicts in gradle/libs.versions.toml

adutra
adutra previously approved these changes Sep 3, 2025
@adutra (Contributor) left a comment:

LGTM.

I think my previous remark still holds though:

https://github.com/apache/polaris/pull/2131/files#r2215576536

@dimas-b dimas-b merged commit c783de9 into apache:main Sep 3, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Sep 3, 2025
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* fix(deps): update dependency io.projectreactor.netty:reactor-netty-http to v1.2.9 (apache#2326)

* Add getting-started example with external authentication (apache#2244)

* chore(deps): update quay.io/keycloak/keycloak docker tag to v26.3.2 (apache#2331)

* fix(deps): update immutables to v2.11.3 (apache#2333)

* JWTBroker: move error message (apache#2330)

This change moves the `LOGGER.error` call when a token cannot be verified from `verify()` to `generateFromToken()`.

On the token generation path, this should be a no-op; however, on the authentication path, this log message was excessive, especially when using mixed authentication since a failure to decode a token is perfectly normal when the token is from an external IDP.

* Let CI archive html test reports (apache#2327)

When having to debug CI test failures, it's much more convenient to be
able to download the HTML report compared to the XML reports (as the
latter require you to find the right file/failure manually).

* Make S3 `roleARN` optional (apache#2329)

Fixes apache#2325

* Remove spotbugs-annotations (apache#2320)

We don't seem to be running SpotBugs/FindBugs in our build, so depending
on the annotations is not necessary.

Also fixes the name of the commons-codec lib.

* Remove redundant locations when constructing access policies (apache#2149)

Iceberg tables can technically store data across any number of paths, but Polaris currently uses 3 different locations for credential vending:
1. The table's base location
2. The table's `write.data.path`, if set
3. The table's `write.metadata.path`, if set

This was intended to capture scenarios where e.g. (2) is not a child path of (1), so that the vended credentials can still be valid for reading the entire table. However, there are systems that seem to always set (2) and (3), such as:

1. `s3:/my-bucket/base/iceberg`
2. `s3:/my-bucket/base/iceberg/data`
3. `s3:/my-bucket/base/iceberg/metadata`

In such cases the extra paths (e.g. extra resources in the AWS Policy) are redundant. In one such case, these redundant paths caused the policy to exceed the maximum allowable 2048 characters.

This PR removes redundant paths -- those that are the child of another path -- from the list of accessible locations tracked for a given table and does some slight refactoring to consolidate the logic for extracting these paths from a TableMetadata.

* Remove CallContext from IcebergPropertiesValidation (apache#2338)

it is sufficient to pass the `RealmConfig`.
same applies to helpers in `PolarisEndpoints`.

* Add entitySubType param to BasePersistence.listEntities (apache#2317)

`BasePersistence.listEntities` has 3 variants:
```
Page<EntityNameLookupRecord> listEntities(..., PageToken);

Page<EntityNameLookupRecord> listEntities(..., Predicate<PolarisBaseEntity>, PageToken)

<T> Page<T> listEntities(..., Predicate<PolarisBaseEntity>, Function<PolarisBaseEntity, T>, PageToken);
```

the 1st method exists to only return the subset of entity properties required to build an `EntityNameLookupRecord`.

the 3rd method supports a predicate and transformer function on the underlying `PolarisBaseEntity`, which means it has to load all entity properties.

the 2nd method is weird as it supports a full `Predicate<PolarisBaseEntity>`, which means it has to load all entity properties under the hood for filtering but then throws most of them away to return a `EntityNameLookupRecord`.
this explains why the implementations of the 2nd method simply forward to the 3rd method usually.
any performance benefits of returning a `EntityNameLookupRecord` are lost.

as it turns out the 2nd method is only used, because methods 1 and 3 dont support passing a `PolarisEntitySubType` parameter to filter down the retrieved data.
Note that the sub type property is available from both the `PolarisBaseEntity` as well as the `EntityNameLookupRecord`.

By adding this parameter, the 2nd method can go away completely.
we can even push down the sub type filtering into the queries of some of our persistence implementations.
other existing implementations are free to decide whether they want to push it down as well or filter on the query results in memory.

note that since we have no `TransactionalPersistence` implementation in the codebase that provides an optimized variant of method 1 we can have a default method in the interface that forwards to method 3.

* Add PyIceberg example (apache#2315)

It is not obvious how to connect PyIceberg to a Polaris catalog.

This PR clears that up by providing an example in the getting-started section of the documentation.

* fix(docs): fix some broken url. (apache#2335)

* fix(docs): fix entity doc API links. (apache#2316)

* fix(deps): update dependency io.netty:netty-codec-http2 to v4.2.4.final (apache#2342)

* NoSQL: Misc ports

* Adopt to the state of apache#2131 (OSS NoSQL PR / idgen)
* Track "base locations" and use an index to detect conflicts (via PolarisMetaStoreManager.hasOverlappingSiblings). Feature must be enabled in the Polaris config. Implementation prepared for intentional overlaps. Backwards compatible, except for checks against already existing tables.
* Cosmetic changes (bunch of)

* Some more adoptions from OSS

... based on a `git diff` against the OSS `persistence-nosql` PR branch.

* Last merged commit 4c23eb7

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Frederic Khayat <61949371+FredKhayat@users.noreply.github.com>
Co-authored-by: Yujiang Zhong <42907416+zhongyujiang@users.noreply.github.com>
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Integration tests for Catalog Federation (apache#2344)

Adds a Junit5 integration test for catalog federation.

* Fix merge conflict in CatalogFederationIntegrationTest (apache#2420)

apache#2344 added a new test for catalog federation, but it looks like an undetected conflict with concurrent changes related to authentication have broken the test in main.

* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.23-6.1755674729 (apache#2416)

* 2334 (apache#2427)

* Fix TableIdentifier in TaskFileIOSupplier (apache#2304)

We can't just convert a `TaskEntity` to an `IcebergTableLikeEntity`: the
`getTableIdentifier()` method will not return a correct value, because it uses
the name of the task and its parent namespace (which is empty?).

task handlers instead need to pass in the `TableIdentifier` that they
already inferred via `TaskEntity.readData`.

* Fix NPE in CreateCatalog (apache#2435)

* Doc fix: Access control page update (apache#2424)

* 2418

* 2418

* fix(deps): update dependency software.amazon.awssdk:bom to v2.32.29 (apache#2443)

* Optimize PolicyCatalog.listPolicies (apache#2370)

this is a follow-up to apache#2290

the optimization is to use `listEntities` instead of `loadEntities` when
there is `policyType` filter to apply

* Add PolarisDiagnostics field to BaseMetaStoreManager (apache#2381)

* Add PolarisDiagnostics field to BaseMetaStoreManager

the ultimate goal is removing the `PolarisCallContext` parameter from every
`PolarisMetaStoreManager` interface method, so we make steps towards
reducing its usage first.

* Add feature flag to disallow custom S3 endpoints (apache#2442)

* Add new realm-level flag: `ALLOW_SETTING_S3_ENDPOINTS` (default: true)

* Enforce in `PolarisServiceImpl.validateStorageConfig()`

Fixes apache#2436

* Deprecate ActiveRolesProvider for removal (apache#2404)

* Client: fix openapi verbose output, remove doc generate, and skip test generations (apache#2439)

* Fix various issue in client code generation

* Use logger instead of print

* Add back exclude on __pycache__ as CI is not via Makefile

* Add back exclude on __pycache__ as CI is not via Makefile

* Add user principal tag in metrics (apache#2445)

* Added API change to enable tag

* Added test

* Added production readiness check

* fix(deps): update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.36.0 (apache#2454)

* fix(deps): update dependency com.google.cloud:google-cloud-storage-bom to v2.56.0 (apache#2447)

* fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.3 (apache#2428)

* Build: Make jandex dependency used for index generation managed (apache#2431)

Also allows specifying the jandex index version for the build.

This is a preparation step contributing to apache#2204, once a jandex fix for reproducible builds is available.

Co-authored-by: Alexandre Dutra <adutra@apache.org>

* Built: improve reproducible archive files (apache#2432)

As part of the effort for apache#2204, this change fixes a few aspects around reproducible builds:

Some Gradle projects produce archive files, but don't get the necessary Gradle archive-tasks settings applied: one not-published project but also the tarball&zip of the distribution. This change moves the logic to the new build-plugin `polaris-reproducible`.

Another change is to have some Quarkus generated jar files adhere to the same conventions, which are constant timestamps for the zip entries and a deterministic order of the entries. That's sadly not a full fix, as the classes that are generated or instrumented by Quarkus differ in each build.

Contributes to apache#2204

* Remove commons-lang3 dependency (apache#2456)

outside of tests we can replace the functionality with jdk11 and guava.
also stop using `org.assertj.core.util` as its a non-public api.

* add refresh credentials property to loadTableResult (apache#2341)

* add refresh credentials property to loadTableResult

* IcebergCatalogAdapterTest: Added test to ensure refresh credentials endpoint is included

* delegate refresh credential endpoint configuration to storage integration

* GCP: Add refresh credential properties

* fix(deps): update dependency io.opentelemetry.semconv:opentelemetry-semconv to v1.37.0 (apache#2458)

* Add Delegator to all API Implementations (apache#2434)

Per the Dev ML, implements the Delegator pattern to add Events instrumentation to all Polaris APIs.

* Prefer java.util.Base64 over commons-codec (apache#2463)

`java.util.Base64` is available since java8 and we are already using it
in a few other spots.

in a follow-up we might be able to get rid of our `commons-codec` dependency
completely.

* Service: Move tests to the right package (apache#2469)

* Update versions in runtime LICENSE and NOTICE (apache#2468)

* fix(deps): update dependency com.adobe.testing:s3mock-testcontainers to v4.8.0 (apache#2475)

* fix(deps): update dependency com.gradleup.shadow:shadow-gradle-plugin to v9.1.0 (apache#2476)

* Service: Remove hadoop-common from polaris-runtime-service (apache#2462)

* Service: Always validate allowed locations from Storage Config (apache#2473)

* Add Community Sync Meeting 20250828 (apache#2477)

* Update dependency software.amazon.awssdk:bom to v2.33.0 (apache#2483)

* Remove PolarisCallContext.getDiagServices (apache#2415)

* Remove PolarisCallContext.getDiagServices usage

* Remove diagnostics from PolarisCallContext

* Feature: Expose resetCredentials via a new reset api to allow root user to reset credentials for an existing principal with custom values  (apache#2197)

* Add type-check to PolarisEntity subclass ctors (apache#2302)

currently one can freely "cast" any `PolarisEntity` to a more
specific type via their constructors.

this can lead to subtle bugs like we fixed in
a29f800

by adding type checks we discover a few more places where we need to be
more careful about how we construct new or handle existing entities.

note that we can add a check for `PolarisEntitySubType` in a followup,
but it requires more fixes currently.

* Fix CI (apache#2489)

Fix undetected merge conflict after apache#2197 + apache#2415 + apache#2434

* Use local diagnostics in TransactionWorkspaceMetaStoreManager

* Add resetCredentials to PolarisPrincipalsEventServiceDelegator

* Core: Prevent AIOOBE for negative codes in PolarisEntityType, PolarisPrivilege, ReturnStatus (apache#2490)

* feat(idgen): Start Implementation of NoSQL with the ID Generation Framework (apache#2131)

Create an ID Generation Framework.

Related to apache#650 & apache#844

Co-authored-by: Robert Stupp <snazy@snazy.de>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>

* perf(refactor): optimizing JdbcBasePersistenceImpl.listEntities (apache#2465)

- Reduced Column Selection: Only 6 columns instead of 16

- Eliminated Object Creation Overhead: Direct conversion to EntityNameLookupRecord without intermediate PolarisBaseEntity

* Add Polaris Events to Persistence (apache#1844)

* AWS CloudWatch Event Sink Implementation (apache#1965)

* Fix failing CI (apache#2498)

* Update actions/stale digest to 3a9db7e (apache#2499)

* Core: Prevent AIOOBE for negative policy codes in PredefinedPolicyType (apache#2486)

* Service: Add location tests for views (apache#2496)

* Update docker.io/jaegertracing/all-in-one Docker tag to v1.73.0 (apache#2500)

* Update dependency io.netty:netty-codec-http2 to v4.2.5.Final (apache#2495)

* Update actions/setup-python action to v6 (apache#2502)

* Update the Release Guide about the Helm Chart package (apache#2179)

* Update the Release Guide about the Helm Chart package

* Update release-guide.md

Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>

* Add missing commit message

* Whitespace

* Use Helm GPG plugin to sign the Helm chart

* Fix directories during Helm chart copy to SVN

* Add Helm index to SVN

* Use long name for svn checkout

* Ensure the Helm index is updated after the chart is moved to SVN dist release

* Do not publish any Docker image before the vote succeeds

* Typos

* Revert "Do not publish any Docker image before the vote succeeds"

This reverts commit 5617e65.

* Don't mention Helm values.yaml in the release guide as it doesn't contain version details

---------

Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>

* Update dependency com.azure:azure-sdk-bom to v1.2.38 (apache#2503)

* Update registry.access.redhat.com/ubi9/openjdk-21-runtime Docker tag to v1.23-6.1756793420 (apache#2504)

* Remove commons-codec dependency (apache#2474)

follow-up to f8ad77a

we can simply use guava instead and eliminate the extra dependency

* CLI: Remove SCRIPT_DIR and default config location to user home (apache#2448)

* Remove readInternalProperties helpers (apache#2506)

the functionality is already provided by the `PrincipalEntity`

* Add Events for Generic Table APIs (apache#2481)


This PR adds the Events instrumentation for the Generic Tables Service APIs, surrounding the default delegated call to the business logic APIs.

* Disable custom namespace locations (apache#2422)

When we create a namespace or alter its location, we must confirm that this location is within the parent location. This PR introduces a check similar to the one we have for tables, where custom locations are prohibited by default. This functionality is gated behind a new behavior change flag `ALLOW_NAMESPACE_CUSTOM_LOCATION`. In addition to allowing us to revert to the old behavior, this flag allows some tests relying on arbitrarily-located namespaces to pass (such as those from upstream Iceberg).

Fixes: apache#2417

* fix for IcebergAllowedLocationTest (apache#2511)

* Remove unused config from SparkSessionBuilder (apache#2512)

Tests pass without it.

* Add Events for Policy Service APIs (apache#2479)

* Remove PolarisTestMetaStoreManager.jsonNode helper (apache#2513)

* Update dependency software.amazon.awssdk:bom to v2.33.4 (apache#2517)

* Update dependency com.nimbusds:nimbus-jose-jwt to v10.5 (apache#2514)

* Update dependency io.opentelemetry:opentelemetry-bom to v1.54.0 (apache#2515)

* Update dependency io.micrometer:micrometer-bom to v1.15.4 (apache#2519)

* Port missed OSS change

* NoSQL: adopt to updated test packages

* NoSQL: adapt to removed PolarisDiagnostics param

* NoSQL: fix libs.versions.toml

* NoSQL: include jandex plugin related changes from OSS

* NoSQL: changes for delete/set principal client-ID+secret

* Last merged commit c6176dc

---------

Co-authored-by: Pooja Nilangekar <poojan@umd.edu>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Honah (Jonas) J. <honahx@apache.org>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: fivetran-kostaszoumpatianos <kostas.zoumpatianos@fivetran.com>
Co-authored-by: Jason <jasonf20@gmail.com>
Co-authored-by: Adnan Hemani <adnan.h@berkeley.edu>
Co-authored-by: Yufei Gu <yufei@apache.org>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: fivetran-arunsuri <103934371+fivetran-arunsuri@users.noreply.github.com>
Co-authored-by: Adam Christian <105929021+adam-christian-software@users.noreply.github.com>
Co-authored-by: Artur Rakhmatulin <artur.rakhmatulin@gmail.com>
Co-authored-by: Pierre Laporte <pierre@pingtimeout.fr>