Is there established precedent for handling cert rotation? #33

Closed
dgoodwin opened this issue Apr 27, 2020 · 6 comments

Comments

@dgoodwin
Contributor

Our webhooks start failing when cert rotation happens. We tried monitoring the mounted cert files and terminating the process when we see a change, but even this does not work reliably and we still end up with stuck pods.
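For readers following along, the file-monitoring approach described above could be sketched roughly as below. This is a minimal illustration, not the code from this thread: `fingerprint` and `watchCert` are hypothetical names, and the sketch polls a content hash instead of relying on file-change events, on the assumption that mounted secrets may be updated by swapping a symlink rather than writing in place. The thread does not confirm that this is why the original monitoring was unreliable.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"os"
	"time"
)

// fingerprint returns the hex-encoded SHA-256 digest of the given bytes.
func fingerprint(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// watchCert (hypothetical helper) reads the file at path once, then polls it
// every interval. It calls onChange the first time the content differs from
// the initial read and then returns. Closing stop ends the loop early.
func watchCert(path string, interval time.Duration, stop <-chan struct{}, onChange func()) error {
	initial, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	want := fingerprint(initial)

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return nil
		case <-ticker.C:
			cur, err := os.ReadFile(path)
			if err != nil {
				continue // the file may be mid-update; retry on the next tick
			}
			if fingerprint(cur) != want {
				onChange()
				return nil
			}
		}
	}
}
```

A caller might run `watchCert(certPath, 30*time.Second, stop, func() { os.Exit(0) })` in a goroutine, where `certPath` and the interval are placeholders to adapt.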

There does not seem to be any way to inject healthz checks or to trap this kind of error. Should we look at adding a health check to this library that monitors for cert auth failures?

Or should we look to an external solution outside our apiserver process (i.e. our operator) that catches the cert change and forces a redeploy?

cc @deads2k @sttts

@dgoodwin
Contributor Author

After discussing this with my team today, we're wondering whether you would be open to a patch that watches for cert auth failures in this library and reflects them in a health check, at which point we let kube decide what to do with us.
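As a rough sketch of what such a health check might look like (an illustration only, not an API this library provides): the hypothetical `certFailureTracker` below counts consecutive cert auth failures and reports unhealthy once a threshold is reached. How failures would be detected and fed into `RecordFailure` is left open here, and the threshold is an assumption to tune.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// certFailureTracker (hypothetical) counts consecutive cert auth failures.
type certFailureTracker struct {
	mu        sync.Mutex
	failures  int
	threshold int
}

func newCertFailureTracker(threshold int) *certFailureTracker {
	return &certFailureTracker{threshold: threshold}
}

// RecordFailure notes one more consecutive failure.
func (t *certFailureTracker) RecordFailure() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failures++
}

// RecordSuccess resets the consecutive-failure count.
func (t *certFailureTracker) RecordSuccess() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failures = 0
}

// Check returns an error once the failure count reaches the threshold.
func (t *certFailureTracker) Check() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.failures >= t.threshold {
		return fmt.Errorf("%d consecutive cert auth failures (threshold %d)", t.failures, t.threshold)
	}
	return nil
}

// ServeHTTP lets the tracker act as a health endpoint: 200 while healthy,
// 500 once Check fails, so a liveness probe can restart the pod.
func (t *certFailureTracker) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if err := t.Check(); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintln(w, "ok")
}
```

The tracker could be registered with something like `http.Handle("/healthz", tracker)` and pointed at by a liveness probe, leaving the restart decision to Kubernetes.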

@sttts
Contributor

sttts commented Apr 28, 2020

k8s.io/apiserver supports reloading the serving certs now. It just needs a rebase of this repo to 1.18. I'm open to a PR doing that.
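To illustrate the general idea of reloading serving certs without a restart, here is a standard-library sketch. It is not the k8s.io/apiserver implementation; `certReloader` is a hypothetical type that uses the `GetCertificate` hook of `crypto/tls` so that each new handshake picks up whatever key pair was loaded most recently.

```go
package main

import (
	"crypto/tls"
	"sync"
)

// certReloader (hypothetical) holds the current serving certificate and
// can swap it for a newer one while the server keeps running.
type certReloader struct {
	certFile, keyFile string

	mu   sync.RWMutex
	cert *tls.Certificate
}

// newCertReloader loads the initial key pair and fails if it cannot be read.
func newCertReloader(certFile, keyFile string) (*certReloader, error) {
	r := &certReloader{certFile: certFile, keyFile: keyFile}
	if err := r.Reload(); err != nil {
		return nil, err
	}
	return r, nil
}

// Reload re-reads the key pair from disk. On error the previously loaded
// certificate stays in place.
func (r *certReloader) Reload() error {
	cert, err := tls.LoadX509KeyPair(r.certFile, r.keyFile)
	if err != nil {
		return err
	}
	r.mu.Lock()
	r.cert = &cert
	r.mu.Unlock()
	return nil
}

// GetCertificate is called by crypto/tls for each handshake and returns
// the most recently loaded certificate.
func (r *certReloader) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.cert, nil
}

// TLSConfig returns a config wired to the reloader.
func (r *certReloader) TLSConfig() *tls.Config {
	return &tls.Config{GetCertificate: r.GetCertificate}
}
```

With an `http.Server` whose `TLSConfig` is set this way, `ListenAndServeTLS("", "")` can be called with empty file names, and `Reload` can be triggered from a timer or a file watcher. Only the serving process is involved, with no dependency on the control plane version.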

@dgoodwin
Contributor Author

Ok that's even better, thanks! We will vendor and try that out asap.

@dgoodwin
Contributor Author

Would this require that the control plane also be 1.18? Or is it purely local to our apiserver process?

We could vendor this soon, but it would be a long time before our operator was hitting a 1.18 control-plane.

@sttts
Contributor

sttts commented Apr 28, 2020

Has nothing to do with the control plane.

dgoodwin changed the title from "Is the established precedent for handling cert rotation?" to "Is there established precedent for handling cert rotation?" on Apr 28, 2020
@dgoodwin
Contributor Author

We implemented an external monitor in our operator, and so far this seems to be working.

2 participants