Skip to content

Conversation

@adutra
Copy link
Contributor

@adutra adutra commented Aug 3, 2025

A new getting-started docker-compose example using external authentication with Keycloak.

@adutra adutra force-pushed the external-auth-fallback branch from 03d1fc5 to 6f0445e Compare August 3, 2025 18:56
Comment on lines 27 to 28
This Keycloak realm contains 1 client definition: `client1:s3cr3t`, and 2 custom scopes: `catalog` and `sign`. It is
also configured to return tokens with the following fixed claims:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be really helpful if we can add a bit about what the custom scopes catalog / sign would mean for Polaris

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means nothing to Polaris. Scopes are between the client and Keycloak. I think I should remove those scopes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The confusion might come from the fact that when making a request to the internal deprecated iceberg oauth token endpoint we include "scope": "PRINCIPAL_ROLE:ALL" and it's not clear why - especially when we don't include it in the request to keycloak. In fact I don't think the documentation mentions at all how these role work, which to request, and why. I think there might even by a typo in the external-idp document where it sometimes says POLARIS_ROLE: instead of PRINCIPAL_ROLE:. A lot of behaviour requires delving into source code to learn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree that internal authentication looks a bit mysterious to users right now.

The internal authentication retrieves its principal roles from the OAuth2 scope field, and expects scopes in the form PRINCIPAL_ROLE:<role_name>. The scope PRINCIPAL_ROLE:ALL is a pseudo-role: it means the client is requesting all of the principal's roles in the system.

I will provide some clarification to the docs in a separate PRs. Thanks for bearing with us 😄

example.

This is obviously not a realistic configuration. In a real-world scenario, you would configure Keycloak to return the
actual principal ID, name and roles of the user. Note that principals must have been created in Polaris beforehand.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe :

Suggested change
actual principal ID, name and roles of the user. Note that principals must have been created in Polaris beforehand.
actual principal ID, name and roles of the user. Note that principals must have been created in Polaris beforehand and principal_id etc should correspond to principal info in polaris

Comment on lines +44 to +45
- `realm-mixed`: This realm is configured to use both the internal and external authentication. It accepts tokens
issued by both Polaris and Keycloak.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be minor doc about which one is checked first helpful or can link existing doc for external idp, wdyt ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it would be really helpful to put some of the information added here into the existing external-idp document, and to put a link into that existing document to this one. In particular, making it very clear in the external-idp document that principals must already exist in Polaris - I got my hopes up about full external identity control, but have since found the milestone documents about user federation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for now you need to create the users in both systems. This is admittedly impractical. We still need to implement full identity federation. I am working on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCIM support ?

@aacampbell
Copy link

This new example is very helpful, and very clearly shows how things work. One thing to note: The create_catalog.sh script doesn't work except for the first realm, because it's hardcoded to hit the deprecated iceberg token endpoint. Switching the URL to Keycloak's, and dropping the scope argument, looks like it'll work.

It does lead to an interesting train of thought. If we can expect that the deprecated iceberg catalog/v1/oauth/tokens endpoint will eventually be removed, what will be best practice? We will all have to use the external authentication type, which means complete dependence on an external system to use even the bootstrap admin user. I see there's another Issue #2238 regarding the bootstrapped admin having a hardcoded root principal name, which is what the principal-mapper must match. This would require a special-case external user in order to do any administration or even setup of other principals in Polaris. Being able to choose the bootstrapped admin's principal name would go some way to allowing uniform mapping of external IdP user attributes to principal names in Polaris, but this is getting to be more of an external concern. What isn't solved is the external dependence for doing any administrative actions. Perhaps a small/basic token endpoint in the management api for getting administrative tokens? That might prevent Polaris from doing too much of the lifting for Auth while still allowing it to be securely administered without external dependencies. I think this would effectively be the same as the current "mixed" authentication case, except the endpoint is under the management api rather than the Iceberg catalog api.

I feel like I got pretty far off topic here. Is there a more appropriate forum?

@adutra
Copy link
Contributor Author

adutra commented Aug 6, 2025

One thing to note: The create_catalog.sh script doesn't work except for the first realm, because it's hardcoded to hit the deprecated iceberg token endpoint. Switching the URL to Keycloak's, and dropping the scope argument, looks like it'll work.

With this PR, you can now pass your own token to the script.

It does lead to an interesting train of thought. If we can expect that the deprecated iceberg catalog/v1/oauth/tokens endpoint will eventually be removed, what will be best practice? We will all have to use the external authentication type, which means complete dependence on an external system to use even the bootstrap admin user. [..] I feel like I got pretty far off topic here. Is there a more appropriate forum?

Wow, there is a lot in there, thanks for sharing your thoughts.

I would recommend reading this (old) document first for getting an idea of where the problems are, especially the sections about SCIM support and persisting federated entities:

https://docs.google.com/document/d/15_3ZiRB6Lhzw0nxij341QUdxEIyFGTrI9_18bFIyJVo/edit?tab=t.0#heading=h.cu1a1acu4lc5

As you can see, all of this is still TBD and for various reasons, the work there has stalled.

I am preparing a PR to simplify roles validation and extract a few interfaces from concrete classes; this would facilitate the introduction of fully-federated principals and principal roles. But the issue with persisting vs not persisting such entities is still up for debate.

\cc @dimas-b @collado-mike @dennishuo

@adutra adutra requested a review from singhpk234 August 12, 2025 17:15
Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good to me ! thank you @adutra

As final step i was trying to run the getting started my self to see if it works for me, and i am getting some failure (i commented where i see them and whats the message i see), may be its an issue in my system but would really appreciate if you can take a look ! if its not an issue please feel free to proceed !

Comment on lines +44 to +45
- `realm-mixed`: This realm is configured to use both the internal and external authentication. It accepts tokens
issued by both Polaris and Keycloak.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCIM support ?

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @adutra !

just tested the getting-started with key-cloak ! got 200 !

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 12, 2025
@singhpk234 singhpk234 merged commit 8996132 into apache:main Aug 12, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Aug 12, 2025
@adutra adutra deleted the external-auth-fallback branch August 13, 2025 07:45
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* fix(deps): update dependency io.projectreactor.netty:reactor-netty-http to v1.2.9 (apache#2326)

* Add getting-started example with external authentication (apache#2244)

* chore(deps): update quay.io/keycloak/keycloak docker tag to v26.3.2 (apache#2331)

* fix(deps): update immutables to v2.11.3 (apache#2333)

* JWTBroker: move error message (apache#2330)

This change moves the `LOGGER.error` call when a token cannot be verified from `verify()` to `generateFromToken()`.

On the token generation path, this should be a no-op; however, on the authentication path, this log message was excessive, especially when using mixed authentication since a failure to decode a token is perfectly normal when the token is from an external IDP.

* Let CI archive html test reports (apache#2327)

when having to debug CI test failures its much more convenient to be
able to download the html report compared to the XML reports (as the
latter requires to you find the right file/failure manually).

* Make S3 `roleARN` optional (apache#2329)

Fixes apache#2325

* Remove spotbugs-annotations (apache#2320)

we dont seem to be running spotbugs/findbugs in our build, so depending
on the annotations is not necessary.

also fix name of common-codec lib.

* Remove redundant locations when constructing access policies (apache#2149)

Iceberg tables can technically store data across any number of paths, but Polaris currently uses 3 different locations for credential vending:
1. The table's base location
2. The table's `write.data.path`, if set
3. The table's `write.metadata.path`, if set

This was intended to capture scenarios where e.g. (2) is not a child path of (1), so that the vended credentials can still be valid for reading the entire table. However, there are systems that seem to always set (2) and (3), such as:

1. `s3:/my-bucket/base/iceberg`
2. `s3:/my-bucket/base/iceberg/data`
3. `s3:/my-bucket/base/iceberg/metadata`

In such cases the extra paths (e.g. extra resources in the AWS Policy) are redundant. In one such case, these redundant paths caused the policy to exceed the maximum allowable 2048 characters.

This PR removes redundant paths -- those that are the child of another path -- from the list of accessible locations tracked for a given table and does some slight refactoring to consolidate the logic for extracting these paths from a TableMetadata.

* Remove CallContext from IcebergPropertiesValidation (apache#2338)

it is sufficient to pass the `RealmConfig`.
same applies to helpers in `PolarisEndpoints`.

* Add entitySubType param to BasePersistence.listEntities (apache#2317)

`BasePersistence.listEntities` has 3 variants:
```
Page<EntityNameLookupRecord> listEntities(..., PageToken);

Page<EntityNameLookupRecord> listEntities(..., Predicate<PolarisBaseEntity>, PageToken)

<T> Page<T> listEntities(..., Predicate<PolarisBaseEntity>, Function<PolarisBaseEntity, T>, PageToken);
```

the 1st method exists to only return the subset of entity properties required to build an `EntityNameLookupRecord`.

the 3rd method supports a predicate and transformer function on the underlying `PolarisBaseEntity`, which means it has to load all entity properties.

the 2nd method is weird as it supports a full `Predicate<PolarisBaseEntity>`, which means it has to load all entity properties under the hood for filtering but then throws most of them away to return a `EntityNameLookupRecord`.
this explains why the implementations of the 2nd method simply forward to the 3rd method usually.
any performance benefits of returning a `EntityNameLookupRecord` are lost.

as it turns out the 2nd method is only used, because methods 1 and 3 dont support passing a `PolarisEntitySubType` parameter to filter down the retrieved data.
Note that the sub type property is available from both the `PolarisBaseEntity` as well as the `EntityNameLookupRecord`.

By adding this parameter, the 2nd method can go away completely.
we can even push down the sub type filtering into the queries of some of our persistence implementations.
other existing implementations are free to decide whether they want to push it down as well or filter on the query results in memory.

note that since we have no `TransactionalPersistence` implementation in the codebase that provides an optimized variant of method 1 we can have a default method in the interface that forwards to method 3.

* Add PyIceberg example (apache#2315)

It is not obvious how to connect PyIceberg to a Polaris catalog.

This PR clears that up by providing an example in the getting-started section of the documentation.

* fix(docs): fix some broken url. (apache#2335)

* fix(docs): fix entity doc API links. (apache#2316)

* fix(deps): update dependency io.netty:netty-codec-http2 to v4.2.4.final (apache#2342)

* NoSQL: Misc ports

* Adopt to the state of apache#2131 (OSS NoSQL PR / idgen)
* Track "base locations" and use an index to detect conflicts (via PolarisMetaStoreManager.hasOverlappingSiblings). Feature must be enabled in the Polaris config. Implementation prepared for intentional overlaps. Backwards compatible, except for checks against already existing tables.
* Cosmetic changes (bunch of)

* Some more adoptions from OSS

... based on a `git diff` against the OSS `persistence-nosql` PR branch.

* Last merged commit 4c23eb7

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Christopher Lambert <xn137@gmx.de>
Co-authored-by: Eric Maynard <eric.maynard+oss@snowflake.com>
Co-authored-by: Frederic Khayat <61949371+FredKhayat@users.noreply.github.com>
Co-authored-by: Yujiang Zhong <42907416+zhongyujiang@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants