Skip to content

Conversation

@dimas-b
Copy link
Contributor

@dimas-b dimas-b commented Aug 16, 2025

With tables, the client can decide whether to purge the table on drop or not. However, Polaris Servers used to unconditionally perform the purge on dropping a view.

After #1619 that behaviour effectively prevents dropping views if the admin user does not set DROP_WITH_PURGE_ENABLED. The latter, though, is not currently advisable per #1617.

This change introduces a new feature configuration (PURGE_VIEWS_ON_DROP) that allows the admin user to instruct Polaris servers to drop views without purging to achieve operational parity with tables.

Fixes #2367

Related dev ML discussion: https://lists.apache.org/thread/12cd8dyh2g0ntn664rp3d1p5z5f9ts9y

@dimas-b
Copy link
Contributor Author

dimas-b commented Aug 16, 2025

(made the branch at the main repo accidentally 🤦 will delete it when the PR is done)

eric-maynard
eric-maynard previously approved these changes Aug 18, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 18, 2025
eric-maynard
eric-maynard previously approved these changes Aug 18, 2025
eric-maynard added a commit that referenced this pull request Aug 18, 2025
In #2369 Iceberg table federation was refactored around the new `IcebergRESTExternalCatalogFactory` type based on discussion in the community sync. This has unblocked the ability to federate to more non-Iceberg catalogs, such as in #2355. This PR refactors generic table federation to go through the same mechanism. After this, we can go through and implement generic table federation for the existing `IcebergRESTExternalCatalogFactory` implementations.
@dimas-b dimas-b force-pushed the view-drop-no-purge branch from 85dbcbf to ccd5214 Compare August 19, 2025 18:55
@dimas-b
Copy link
Contributor Author

dimas-b commented Aug 19, 2025

@eric-maynard : I renamed the feature flag to PURGE_VIEW_METADATA_ON_DROP per email discussion. Since there were no strong opinions, I'm open to going back to the previous name or any other name you suggest :)

https://lists.apache.org/thread/k94o7k0f6d2cqgqx24v8j21z01ho34sl

eric-maynard
eric-maynard previously approved these changes Aug 19, 2025
Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM!

bmlyr pushed a commit to bmlyr/polaris that referenced this pull request Aug 19, 2025
In apache#2369 Iceberg table federation was refactored around the new `IcebergRESTExternalCatalogFactory` type based on discussion in the community sync. This has unblocked the ability to federate to more non-Iceberg catalogs, such as in apache#2355. This PR refactors generic table federation to go through the same mechanism. After this, we can go through and implement generic table federation for the existing `IcebergRESTExternalCatalogFactory` implementations.
snazy
snazy previously approved these changes Aug 20, 2025
With tables, the client can decide whether to purge the table
on drop or not. However, Polaris Servers used to unconditionally
perform the purge on dropping a view.

After #1619 that behaviour effectively prevents dropping views
if the admin user does not set `DROP_WITH_PURGE_ENABLED`. The
latter, though, is not currently advisable per #1617.

This change introduces a new feature configuration
(`PURGE_VIEWS_ON_DROP`) that allows the admin user to instruct
Polaris servers to drop views without purging to achieve
operational parity with tables.

Fixes #2367
@dimas-b
Copy link
Contributor Author

dimas-b commented Aug 20, 2025

rebased and resolved conflicts

@dimas-b dimas-b merged commit c97b150 into main Aug 20, 2025
12 checks passed
@dimas-b dimas-b deleted the view-drop-no-purge branch August 20, 2025 15:38
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Aug 20, 2025
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Refactor Authenticator and PolarisPrincipal (apache#2307)

The main goal of this change is to facilitate future integration of federated principals:

- `AuthenticatedPolarisPrincipal` becomes an interface `PolarisPrincipal`, as the original class leaks implementation details (references to `PrincipalEntity` and thus to the storage layer). The new interface does not reference the storage layer. This is one step further towards easy pluggability of authentication in Polaris.

- The `Authenticator.authenticate()` method does not return an `Optional` anymore, as this was ambiguous (returning `Optional.empty()` vs throwing `NotAuthorizedException`).

- Also the `Authenticator` interface is not generic anymore. This was an artifact of times when there were two kinds of `Authenticators` in Polaris (one for internal auth, the other for external) and is not necessary anymore.

* Add PolarisDiagnostics field to TransactionalMetaStoreManagerImpl (apache#2361)

the ultimate goal is removing the PolarisCallContext parameter from every
PolarisMetaStoreManager interface method, so we make steps towards reducing
its usage first.

* Support HMS Federation (apache#2355)

Supports federating to HiveCatalog using the Iceberg REST library. 
All hive dependencies are added in an independent module, i.e., `polaris-extensions-federation-hive` and can be removed/converted to a compile time flag if necessary. 
Similar to HadoopCatalog, HMS federation support is currently restricted to `IMPLICIT` auth. The underlying authentication can be any form that Hive supports, however Polaris will not store and manage any of these credentials. Again, similar to HadoopCatalog, this version supports federating to a single Hive instance. 

This PR relies on Polaris discovering the `hive-site.xml` file to get the configuration options from the classpath (including `HADOOP_CONF_DIR`).

The spec change has been discussed in the [dev mailing list](https://lists.apache.org/thread/5qktjv6rzd8pghcl6f4oohko798o2p2g), followed by a discussion in the Polaris community sync on Aug 7, 2025. 

Testing: 
Modified the regression test to locally test that Hive federation works as expected. The next step would be to add a regression test once the change is baked into the Polaris docker image (for CI builds).


This PR primarily builds on apache#1305 and apache#1466. Thank you @dennishuo and @eric-maynard for helping out with this!

* Add PolarisDiagnostics field to TransactionWorkspaceMetaStoreManager (apache#2359)

the ultimate goal is removing the `PolarisCallContext` parameter from every
`PolarisMetaStoreManager` interface method, so we make steps towards
reducing its usage first.

* Rat-ignore user-settings for hugo-run-in-docker (apache#2376)

* Modularize generic table federation (apache#2379)

In apache#2369 Iceberg table federation was refactored around the new `IcebergRESTExternalCatalogFactory` type based on discussion in the community sync. This has unblocked the ability to federate to more non-Iceberg catalogs, such as in apache#2355. This PR refactors generic table federation to go through the same mechanism. After this, we can go through and implement generic table federation for the existing `IcebergRESTExternalCatalogFactory` implementations.

* Update community meeting dates (apache#2382)

* Reduce getRealmConfig calls (apache#2337)

Classes with a `CallContext` field should call `getRealmConfig` once and store it as a field as well.
The idea is that long term we would want to stop relying on the `CallContext` itself but instead inject its individual items. Thus we also add `RealmConfig` to `TestServices`.

* Python client: make S3 role-ARN optional and add missing endpoint-internal property (apache#2339)

* fix(deps): update dependency io.prometheus:prometheus-metrics-exporter-servlet-jakarta to v1.4.1 (apache#2377)

* chore(deps): bump s3mock from 3.11.0 to 4.7.0 (apache#2375)

Updates S3Mock testcontainer dependency from 3.11.0 to 4.7.0 and refactors usage into a centralized wrapper class in runtime/test-common.

Changes

    Upgraded S3Mock testcontainer to 4.7.0
    Created S3Mock wrapper class for consistent configuration
    Consolidated S3 config properties generation
    Updated integration tests to use new wrapper

No functional changes to test behavior.

* Nit: extract getResolvedCatalogEntity method in IcebergCatalogHandler (apache#2387)

* Nit: remove transitive dependencies from runtime/server/build.gradle.kts (apache#2385)

* Nit: add methods isExternal and isStaticFacade to CatalogEntity (apache#2386)

* Minor refactor of integration test classes (apache#2384)

This change promotes `CatalogConfig` and `RestCatalogConfig` to top-level, public annotations and introduces a few "hooks" in `PolarisRestCatalogIntegrationBase` that can be overridden by subclasses.

This change is a preparatory work for apache#2280 (S3 remote signing).

* Remove BaseMetaStoreManager.serializeProperties (apache#2374)

similar to 7af85be we should prefer
the existing helper methods on the entity instead

* fix: minor corrections of documentation (apache#2397)

- fixed dead link to catalog definition in Iceberg docs on Entities page
- removed single quotes from credential parameter in the cmdline example for connecting a local spark-sql: env variables need to be resolved in cmdline, they will not be resolved by spark-sql itself.

* chore(deps): update azure/setup-helm action to v4.3.1 (apache#2402)

* Add 1.0.1 release to the website (apache#2400)

* Add PolarisDiagnostics field to AbstractTransactionalPersistence (apache#2372)

The ultimate goal is removing the `PolarisCallContext` parameter from every `PolarisMetaStoreManager` interface method, so we make steps towards reducing its usage first.

* NoSQL: javadoc nit

* Last merged commit fcd4777

---------

Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Pooja Nilangekar <poojan@umd.edu>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: JB Onofré <jbonofre@apache.org>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Artur Rakhmatulin <from_github@binaryc.at>
Co-authored-by: olsoloviov <40199597+olsoloviov@users.noreply.github.com>
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* feat: Add Pod Disruption Budget support to Helm chart (apache#2380)

* chore(deps): update quay.io/keycloak/keycloak docker tag to v26.3.3 (apache#2407)

* Mention Helm chart support for PodDisruptionBudget in CHANGELOG.md (apache#2408)

* chore: Suppress javac deprecation warnings in SparkCatalog (apache#2394)

SparkCatalog intentionally overrides and uses deprecated
methods from Spark's TableCatalog.

This PR adds suppression annotations to allow for clean
compilation given that the deprecated method calls and
overrides are clearly expected in this case.

* Python client auto generate (apache#2192)

* Python client auto generate

* Python client auto generate

* Python client auto generate

* Python client auto generate

* Python client auto generate

* Python client auto generate

* Remove auto generated doc

* undo

* Fix doc

* Fix docker ref from CONTAINER_TOOL to DOCKER

* Add client help manual to GH action

* Add missing region to MinIO getting-started example (apache#2411)

The example was missing an AWS region, thus causing Spark to fail with:

```
spark-sql ()> create table ns.t1 as select 'abc';
25/08/20 16:25:06 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
software.amazon.awssdk.core.exception.SdkClientException: Unable to load region from any of the providers in the chain software.amazon.awssdk.regions.providers.DefaultAwsRegionProviderChain@47578c86: [software.amazon.awssdk.regions.providers.SystemSettingsRegionProvider@1656f847: Unable to load region from system settings. Region must be specified either via environment variable (AWS_REGION) or  system property (aws.region)., software.amazon.awssdk.regions.providers.AwsProfileRegionProvider@2bbaabe3: No region provided in profile: default, software.amazon.awssdk.regions.providers.InstanceProfileRegionProvider@54b1cfd8: Unable to contact EC2 metadata service.]
...
        at org.apache.iceberg.aws.AwsClientFactories$DefaultAwsClientFactory.s3(AwsClientFactories.java:119)
        at org.apache.iceberg.aws.s3.S3FileIO.client(S3FileIO.java:391)
        at org.apache.iceberg.aws.s3.S3FileIO.newOutputFile(S3FileIO.java:193)
```

* Add feature config to allow dropping views without purging (apache#2369)

* Add feature config to allow dropping views without purging

With tables, the client can decide whether to purge the table
on drop or not. However, Polaris Servers used to unconditionally
perform the purge on dropping a view.

After apache#1619 that behaviour effectively prevents dropping views
if the admin user does not set `DROP_WITH_PURGE_ENABLED`. The
latter, though, is not currently advisable per apache#1617.

This change introduces a new feature configuration
(`PURGE_VIEWS_ON_DROP`) that allows the admin user to instruct
Polaris servers to drop views without purging to achieve
operational parity with tables.

Fixes apache#2367

* review: rename to PURGE_VIEW_METADATA_ON_DROP

* review: re-fix description

* Last merged commit c97b150

---------

Co-authored-by: Bryan Maloyer <bryan.mlyr@gmail.com>
Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Dmitri Bourlatchkov <dmitri.bourlatchkov@gmail.com>
Co-authored-by: Yong Zheng <yongzheng0809@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to drop view with default server configuration

3 participants