[Feature] An ability to specify OIDC issuer url (guid or even user-provided-string) deterministically during cluster creation time #3982
Comments
This is a must-have for multi-tenancy, where the cluster lifecycle is typically controlled by a platform team, but the federated credentials and managed identities are controlled by users/developer teams. Without a static/predictable/recoverable OIDC issuer URL, if the platform team needs to recreate the cluster for any reason, the OIDC issuer URL would get rotated and cause a breaking change for users' workload identity federations.
Thanks for letting us know your feedback and user scenario. There is a security risk with a BYO (bring your own) OIDC issuer URL.
Thanks for the attention @CocoWang-wql. I don't think we necessarily need the ability to BYO the issuer URL at cluster creation, as long as all federated credentials "find their way back" after a cluster recreation. I suppose there could be a couple of angles to approach this from; here are a few:
Thanks for the info. Would like to know more details:
The problem is not with the Pod or Service Account. The problem is that the user-assigned managed identity to which the Service Account is federated via a federated identity credential still references the old cluster's OIDC issuer URL.
Here's the scenario.
The practical result of this is that an AKS cluster must be treated like a "pet" that can never be recreated, because if we do recreate it, we cause a breaking change for all users/developers: all workload identity federations stop working, and we need to call every developer/user and ask them to update their federated credentials.
The pain point is having to re-establish the federated identity credential with the updated OIDC URL for all services. Imagine running 100+ services (each with its own distinct MSI) in a cluster and having to update the OIDC URL for each.
Also feeling the pain of this issue. We have to coordinate the recreation of all managed identity federations. If there was a way to programmatically find all federated credentials for a cluster (by tag name or something), then we could automate this, but currently we'd have to search through all managed identities to find matches. I guess we could experiment with storing this URL behind some reverse proxy, but that seems like a lot of experimentation for something that might not work.
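The discovery step described above can be sketched in a few lines. This is a hedged illustration, not an existing Azure feature: it assumes the federated-credential records have already been exported (e.g. by iterating managed identities with `az identity federated-credential list`) into dicts with `name`, `identity`, and `issuer` fields — the record shape here is an assumption for illustration.

```python
# Sketch: given federated-credential records exported from Azure,
# find the ones whose issuer matches a given cluster's OIDC issuer URL.
def credentials_for_issuer(creds, issuer_url):
    """Return the federated credentials that point at issuer_url."""
    target = issuer_url.rstrip("/")
    return [c for c in creds if c["issuer"].rstrip("/") == target]

# Hypothetical exported records (names/GUIDs are placeholders).
creds = [
    {"name": "svc-a-fic", "identity": "svc-a-msi",
     "issuer": "https://westus2.oic.prod-aks.azure.com/tenant/old-guid/"},
    {"name": "svc-b-fic", "identity": "svc-b-msi",
     "issuer": "https://westus2.oic.prod-aks.azure.com/tenant/other-guid/"},
]

old_issuer = "https://westus2.oic.prod-aks.azure.com/tenant/old-guid"
matches = credentials_for_issuer(creds, old_issuer)
print([c["name"] for c in matches])  # ['svc-a-fic']
```

The matching credentials could then be rewritten with the new issuer (e.g. via `az identity federated-credential update --issuer ...`), which is exactly the coordination burden the thread is describing.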
+1 on this issue, this makes it a huge pain for platform teams that need to replace clusters. There should be a way to have a 'static' endpoint so we do not need to update the federations on all of the identities.
+1 on the issue. We don't have control over the downstream configuration, which adds complexity and dependencies.
I have now paused our migration to workload identity, as this would make DR so much harder. We will stick with AAD Pod Identity until this is resolved.
The inability to share OIDC issuer URIs across clusters is a pain point for workload identity adoption. We do blue/green Kubernetes cluster deployments to avoid potential issues during infrastructure updates. E.g.:
The problem is that every update cycle, the new cluster gets a new issuer URI and we have to keep track of and re-create every federation again (actually keep two instances of each, because both clusters are online at the same time). This is something we have solved for our self-hosted clusters, where we can bring our own static issuer. Having an issuer as a separate object in Azure would be great, along with the ability to optionally specify one when creating/recreating a cluster. In our case the two clusters would simply point at the same issuer. In this scenario it doesn't matter what the URI is, as long as it's static. The complexity of creating and rotating keys could also be abstracted away from the user. EDIT: Reading this again, I realized what I suggested above is exactly what @illrill suggested; I missed that somehow :)
+1 for this. A possible workaround would be to use Terraform to destroy/create the AKS cluster, and then a Terraform apply to update the identities based on the new cluster's OIDC issuer URL. It would be great not to have to do this.
That's a potential workaround, but a bad one: as called out above, my team manages the "platform" and we have hundreds of services deployed on the cluster; we don't manage their identities and in many cases can't even see them.

We've engaged with Microsoft professional services (which is how I was linked to this thread), as we have a similar issue to what's mentioned: our DR strategy is to replace the cluster if something goes catastrophically wrong. We also run our own on-prem clusters, which allow us to manage our own JWKS/OIDC endpoints; that is not possible with Azure, since we have no read/write access to the service-account signing key or to cluster configuration at that level.

We recently ran trials with GKE, and with their fleet (what used to be Anthos) there is a single endpoint that multiple clusters take on. That was part of our ask as well, since we run many clusters per environment (effectively one federated credential for "prod" instead of one per cluster).
Will this be tracked/resolved by #2861? |
@CocoWang-wql any updates on this topic?
Is your feature request related to a problem? Please describe.
Right now, when an AKS cluster is created with the OIDC issuer enabled, the OIDC issuer URL is generated randomly, such as:
https://westus2.oic.prod-aks.azure.com/[tenantId]/[random-guid]/
This poses a maintenance problem when we need to delete and recreate a cluster, because we have to ask all deployed services to update their identities' federated credentials with the new randomly generated OIDC URL.
Describe the solution you'd like
The ability to specify an OIDC issuer GUID deterministically at cluster creation time, such as:
This would allow us to keep the same OIDC issuer URL even when clusters are destroyed and recreated, and would make the cluster recreation process transparent to deployed services, without having to ask them to update federated credentials.
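The requested behavior can be sketched as follows. This is not an existing AKS capability; it is a minimal illustration, assuming the issuer URL keeps the format observed above and that the GUID segment could be caller-supplied instead of random, so recreating a cluster with the same inputs reproduces the same URL:

```python
# Sketch of the requested behavior (hypothetical, not an AKS API):
# build the issuer URL from a caller-supplied GUID instead of a random one,
# so a recreated cluster yields an identical URL.
def oidc_issuer_url(region, tenant_id, issuer_guid):
    # Mirrors the observed AKS issuer URL format from this issue.
    return f"https://{region}.oic.prod-aks.azure.com/{tenant_id}/{issuer_guid}/"

url_before = oidc_issuer_url("westus2", "tenant-id", "my-stable-guid")
url_after = oidc_issuer_url("westus2", "tenant-id", "my-stable-guid")
assert url_before == url_after  # deterministic across recreations
```

With a deterministic URL, every federated credential created against `url_before` would remain valid after the cluster is destroyed and recreated, which is the whole point of the request.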