KEDA components don't reload certificates #5055

Open
JorTurFer opened this issue Oct 5, 2023 · 5 comments
Labels
bug Something isn't working stale-bot-ignore All issues that should not be automatically closed by our stale bot

Comments

@JorTurFer
Member

JorTurFer commented Oct 5, 2023

Report

Currently, we have some processes that rely on generated certificates, such as the metrics server and the communication channel between the metrics server (MS) and the operator. If the certificate changes, we must ensure that it's reloaded in all the components. For example, if a user switches from an operator-managed certificate to cert-manager, the metrics server won't be restarted because nothing has changed on it, but the certificate exposed by the operator has changed.

This also applies to CA rotation, etc.

@JorTurFer JorTurFer added the bug Something isn't working label Oct 5, 2023
@SpiritZhou
Contributor

Is using push notifications from the operator to notify other servers to reload certificates a good solution?

@JorTurFer
Member Author

I think that just a watch on the file is a better option, because the operator doesn't always manage the certificates. I mean, if certificates are managed externally, I'd like to reload them when there are changes on the file system (IIRC, mounted secrets are updated when the secret changes).
We have to take into account different scenarios:

  • Internal communications (KEDA components):
    • Operator gRPC server: It should watch the cert and restart the server with the new certificate
    • Metrics server gRPC client: It can monitor the cert, but I'd just read the certificate on every connection attempt. If we restart the operator's gRPC server, this will close the connection, and reading the certificate from file at that moment is okay IMO.
    • Metrics server metrics endpoint: We should validate how it works with the upstream library (custom-metrics-apiserver) to check whether it does the hot reload or not. If not, we need to figure out how to do it.
    • Admission webhooks: Same as with the metrics endpoint, we need to figure out how to do it with the controller manager (operator-sdk/kubebuilder)
  • External communications:
    • This only applies to the operator, as it's the only one that establishes external communications. In this case, we should update the stored rootCAs, probably by regenerating the collection whenever any certificate in the custom folder path changes. Clients already created in the scalers cache will still use the old certificate until the client fails. I wouldn't change this behavior: if the scaler was already created with a client using the old rootCAs, that's okay, because when the old certificate expires the upstream will reject it, forcing the scaler to be recreated, this time with the new rootCAs.

Maybe I'm missing something, what do you think @zroubalik?

@zroubalik
Member

I agree that using a file watch is the best solution.

I think there's already existing functionality for that in the Metrics Server library, but I'm not 100% sure at the moment.

@SpiritZhou
Contributor

Great! Using a file watch is the better option.


stale bot commented Dec 16, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Dec 16, 2023
@JorTurFer JorTurFer added the stale-bot-ignore All issues that should not be automatically closed by our stale bot label Dec 16, 2023
@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Dec 16, 2023
Projects
Status: To Do
Development

No branches or pull requests

3 participants