Azure Keyvault access stops working correctly with Kustomize-controller >=0.22.0 #595
Thank you for your extensive report. The However, your observations appear logical to me, which means something about the above apparently is not true. |
The other change between |
Yes @hiddeco , just tested with:
and
I can confirm that decryption with AzureKV-backed keys works fine with sops 3.7.2, too, at least in my case. |
Think I found the issue, will prepare a patch. |
Can you please give the following (AMD64) image a try? |
Nope, it is still in error :-(
|
Are you sure this information is from after you rolled out the above image? As based on the patch I created, I am quite sure this was the mistake, and the patch must resolve the issue. Latest image, fully up-to-date with current state of PR: |
Now available as an official release in https://github.com/fluxcd/flux2/releases/tag/v0.28.3 |
Hi @hiddeco, thank you for your quick intervention. Unfortunately that did not solve the issue, which I still have both for your image, hosted on Anyway, while I'm sure that handling the fallback to the default server for Azure KV in a better manner, as you did in this patch, is an improvement, I don't think it can help in my case. Incidentally, both fallback and secretRef-provided auth information are the same in my case (we're not ready to use different keys for various kustomizations, though we are aiming for it in the future). So, using fallback or secretRef-provided (kustomization-specific) auth wouldn't make any difference, at least in my case. I'm sorry for not having specified this clearly beforehand. Moreover, I already checked that the appropriate token was retrieved from Azure AD by sops. The problem surfaces later: for some reason, the Azure API (azkeys) sends a bad api_version in the REST call. This appears to be beyond the scope of kustomize-controller, which is merely a user of that SDK by means of the embedded sops module. Could something be going wrong there, instead of in kustomize-controller? Thanks! |
However, @hiddeco, I can confirm that, if secretRef is removed, fallback works perfectly right now with kustomize-controller 0.22.2, so your patch works in that respect. This is a workaround that allows me to use recent kustomize-controller versions, as long as I do not need to use different secrets/keys/key vaults in different kustomizations. I can afford not using that at this point in development, so thank you! The problem remains with kustomizations using both secretRef and AzureKV, which, as you stated before, use a different SDK, where the problem still lies. |
Can confirm that kustomize-controller crashes as soon as it tries to reconcile a Kustomization that uses
I'm also using |
In my case, this release is broken. |
@BertelBB can you share more details about your SOPS configuration and your Azure setup so I can replicate your precise setup? |
@log2 for the issues you described in #595 (comment), I expect https://github.com/fluxcd/kustomize-controller/releases/tag/v0.22.3 to now work without issues. @BertelBB this might solve your issues as well. If not, please provide me with the output of |
@hiddeco I installed the latest version of gotk on one of my clusters this morning and the kustomize-controller functioned properly while decrypting sops secrets. I guess it is using v0.22.3, I didn’t check the release notes. I don’t have access to my setup but I can give you a high level overview. aad-pod-identity using the latest official Helm chart with an identity authorized to use a key in Azure Key Vault. kustomize-controller deployed with gotk@v0.28.3 and patched to bind with Azure Identity and use MSI authentication. Kustomization object configured with Downgrading gotk to an older version resolves the issue. |
Ok I looked into it and my dev cluster that is running the latest version of gotk (v0.28.4) and has I can try replicating the setup that was failing and give you a more detailed report if you want |
@BertelBB can you first confirm that the cluster that was misbehaving didn't accidentally have kustomize-controller I am investigating another issue which I am experiencing in a replicated environment like yours, where the MSI authentication flow of SOPS seems to become unresponsive. However, this also seems to happen for anything in the |
This appears to just take a very long time in the versions I tried out, eventually ending up with an error like:
|
That means the cause of the issue was the broken fallback, which was restored in I will be making some structural changes to the code so this can not happen again, while adding some more tests for other KMS solutions to ensure regression bugs (or compatibility issues with upstream) will be detected earlier, as already done for Azure Key Vault in #604. Given all this, I think this issue can be closed, but do not hesitate to comment (or open a new issue) if new problems and/or questions arise. Thanks all 🙇 |
Manual patches to overwrite the source-controller version can be removed when updating to https://github.com/fluxcd/flux2/releases/tag/v0.28.5. This should solve all reported problems in this issue (and the general |
When using Azure KV with sops-stored secrets in kustomize-controller >= 0.22.0, a strange behaviour appears: decryption is rejected with a bad parameter error (HTTP status code 400), with no further indication of what's wrong.
Using the Azure Monitor Key Vault functionality and enabling detailed logging (even for failures), I was able to pinpoint the error.
The REST API address generated by the SOPS embedded in kustomize-controller is:
https://my-keyvault-name.vault.azure.net/keys/sops-key/1d....my-key-version.../decrypt?api-version=7.3-preview
Instead of
https://my-keyvault-name.vault.azure.net/keys/sops-key/1d....my-key-version.../decrypt?api-version=2016-10-01
Which is what the sops CLI generates on macOS.
This api-version is surely strange, because it does not resemble any API version scheme used in Azure (right?), but it is close enough to the latest version number in the Azure SDK for Go, so maybe it is a bug in a recent version of that library.
So I started searching the history of go.mod and testing recent versions of kustomize-controller that were in use before bumping Flux.
0.22.0 and 0.22.1 exhibit the very same failure, so maybe what broke Azure KV is this version (0.4.0) of the azkeys library, introduced in 0.22.0.
kustomize-controller 0.21.1, by contrast, works just fine.