Add certificate expiry check, events, and metrics #9772
Merged
brandond changed the title from "Add certificate expiry check and warnings" to "[WIP] Add certificate expiry check and warnings" (Mar 23, 2024)
Example:
root@k3s-server-1:/# k3s certificate check
INFO[0000] Server detected, checking agent and server certificates
INFO[0000] Checking certificates for k3s-controller
INFO[0000] /var/lib/rancher/k3s/server/tls/client-k3s-controller.crt: certificate CN=system:k3s-controller is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-k3s-controller.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-k3s-controller.crt: certificate CN=system:k3s-controller is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-k3s-controller.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for admin
INFO[0000] /var/lib/rancher/k3s/server/tls/client-admin.crt: certificate CN=system:admin,O=system:masters is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-admin.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for auth-proxy
INFO[0000] /var/lib/rancher/k3s/server/tls/client-auth-proxy.crt: certificate CN=system:auth-proxy is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-auth-proxy.crt: certificate CN=k3s-request-header-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for controller-manager
INFO[0000] /var/lib/rancher/k3s/server/tls/client-controller.crt: certificate CN=system:kube-controller-manager is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-controller.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for kubelet
INFO[0000] /var/lib/rancher/k3s/agent/client-kubelet.crt: certificate CN=system:node:k3s-server-1,O=system:nodes is ok, expires at 2025-03-25T22:34:30Z
INFO[0000] /var/lib/rancher/k3s/agent/client-kubelet.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/serving-kubelet.crt: certificate CN=k3s-server-1 is ok, expires at 2025-03-25T22:34:30Z
INFO[0000] /var/lib/rancher/k3s/agent/serving-kubelet.crt: certificate CN=k3s-server-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for kube-proxy
INFO[0000] /var/lib/rancher/k3s/server/tls/client-kube-proxy.crt: certificate CN=system:kube-proxy is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-kube-proxy.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-kube-proxy.crt: certificate CN=system:kube-proxy is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-kube-proxy.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for api-server
INFO[0000] /var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt: certificate CN=system:apiserver,O=system:masters is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt: certificate CN=kube-apiserver is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt: certificate CN=k3s-server-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for cloud-controller
INFO[0000] /var/lib/rancher/k3s/server/tls/client-k3s-cloud-controller.crt: certificate CN=k3s-cloud-controller-manager is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-k3s-cloud-controller.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for etcd
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/client.crt: certificate CN=etcd-client is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/client.crt: certificate CN=etcd-server-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/server-client.crt: certificate CN=etcd-server is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/server-client.crt: certificate CN=etcd-server-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt: certificate CN=etcd-peer is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt: certificate CN=etcd-peer-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for scheduler
INFO[0000] /var/lib/rancher/k3s/server/tls/client-scheduler.crt: certificate CN=system:kube-scheduler is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/server/tls/client-scheduler.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
root@k3s-agent-1:/# k3s certificate check
INFO[0000] Agent detected, checking agent certificates
INFO[0000] Checking certificates for kube-proxy
INFO[0000] /var/lib/rancher/k3s/agent/client-kube-proxy.crt: certificate CN=system:kube-proxy is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-kube-proxy.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for kubelet
INFO[0000] /var/lib/rancher/k3s/agent/client-kubelet.crt: certificate CN=system:node:k3s-agent-1,O=system:nodes is ok, expires at 2025-03-25T22:34:36Z
INFO[0000] /var/lib/rancher/k3s/agent/client-kubelet.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/serving-kubelet.crt: certificate CN=k3s-agent-1 is ok, expires at 2025-03-25T22:34:36Z
INFO[0000] /var/lib/rancher/k3s/agent/serving-kubelet.crt: certificate CN=k3s-server-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
INFO[0000] Checking certificates for k3s-controller
INFO[0000] /var/lib/rancher/k3s/agent/client-k3s-controller.crt: certificate CN=system:k3s-controller is ok, expires at 2025-03-25T22:34:28Z
INFO[0000] /var/lib/rancher/k3s/agent/client-k3s-controller.crt: certificate CN=k3s-client-ca@1711406068 is ok, expires at 2034-03-23T22:34:28Z
brandond@dev01:~$ kubectl get --raw /api/v1/nodes/k3s-server-1/proxy/metrics | grep k3s_certificate_expiration
# HELP k3s_certificate_expiration_seconds Remaining lifetime on the certificate.
# TYPE k3s_certificate_expiration_seconds gauge
k3s_certificate_expiration_seconds{subject="CN=etcd-client",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=etcd-peer",usages="ServerAuth,ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=etcd-peer-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
k3s_certificate_expiration_seconds{subject="CN=etcd-server",usages="ServerAuth,ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=etcd-server-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
k3s_certificate_expiration_seconds{subject="CN=k3s-client-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
k3s_certificate_expiration_seconds{subject="CN=k3s-cloud-controller-manager",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=k3s-request-header-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
k3s_certificate_expiration_seconds{subject="CN=k3s-server-1",usages="ServerAuth"} 3.15359980568481e+07
k3s_certificate_expiration_seconds{subject="CN=k3s-server-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
k3s_certificate_expiration_seconds{subject="CN=kube-apiserver",usages="ServerAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:admin,O=system:masters",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:apiserver,O=system:masters",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:auth-proxy",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:k3s-controller",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:kube-controller-manager",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:kube-proxy",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:kube-scheduler",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=system:node:k3s-server-1,O=system:nodes",usages="ClientAuth"} 3.15359980568481e+07
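The gauge values above are remaining lifetimes in seconds, which are hard to read at a glance. As a rough illustration only (this is not code from the PR), the following Python sketch parses lines in the format shown above and converts them to days. The helper name `parse_expiration_metrics` and the two-line sample input are assumptions made for the example, and the regex assumes the label order seen in this output.

```python
import re

# Matches lines such as:
# k3s_certificate_expiration_seconds{subject="CN=etcd-client",usages="ClientAuth"} 3.15e+07
# Assumes the label order (subject, then usages) shown in the output above.
METRIC_RE = re.compile(
    r'^k3s_certificate_expiration_seconds'
    r'\{subject="(?P<subject>[^"]*)",usages="(?P<usages>[^"]*)"\}\s+'
    r'(?P<value>\S+)$'
)

SECONDS_PER_DAY = 86400


def parse_expiration_metrics(text):
    """Return {subject: remaining_days} for each matching metric line.

    Hypothetical helper for illustration. Comment lines (# HELP, # TYPE)
    and unrelated metrics are skipped. If two series share a subject,
    the later one wins.
    """
    remaining = {}
    for line in text.splitlines():
        match = METRIC_RE.match(line.strip())
        if match:
            seconds = float(match.group("value"))
            remaining[match.group("subject")] = seconds / SECONDS_PER_DAY
    return remaining


# Sample input copied from two of the metric lines above.
sample = """\
# HELP k3s_certificate_expiration_seconds Remaining lifetime on the certificate.
# TYPE k3s_certificate_expiration_seconds gauge
k3s_certificate_expiration_seconds{subject="CN=etcd-client",usages="ClientAuth"} 3.15359970568481e+07
k3s_certificate_expiration_seconds{subject="CN=k3s-client-ca@1711483842",usages="CertSign"} 3.153599970558642e+08
"""

days = parse_expiration_metrics(sample)
# Print the certificates closest to expiry first.
for subject, value in sorted(days.items(), key=lambda item: item[1]):
    print(f"{subject}: {value:.1f} days remaining")
```

In this sample the leaf certificate has roughly one year left and the CA roughly ten, which matches the expiry dates printed by `k3s certificate check`.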
brandond@dev01:~$ kubectl get event
LAST SEEN TYPE REASON OBJECT MESSAGE
2m26s Warning CertificateExpirationWarning node/k3s-server-1 Node certificates require attention - restart k3s on this node to trigger automatic rotation: admin/client-admin.crt: certificate CN=system:admin,O=system:masters will expire within 90 days at 2024-04-27T18:56:27Z, api-server/client-kube-apiserver.crt: certificate CN=system:apiserver,O=system:masters will expire within 90 days at 2024-04-27T18:56:27Z, api-server/serving-kube-apiserver.crt: certificate CN=kube-apiserver will expire within 90 days at 2024-04-27T18:56:27Z, auth-proxy/client-auth-proxy.crt: certificate CN=system:auth-proxy will expire within 90 days at 2024-04-27T18:56:27Z, cloud-controller/client-k3s-cloud-controller.crt: certificate CN=k3s-cloud-controller-manager will expire within 90 days at 2024-04-27T18:56:27Z, controller-manager/client-controller.crt: certificate CN=system:kube-controller-manager will expire within 90 days at 2024-04-27T18:56:27Z, etcd/client.crt: certificate CN=etcd-client will expire within 90 days at 2024-04-27T18:56:27Z, etcd/server-client.crt: certificate CN=etcd-server will expire within 90 days at 2024-04-27T18:56:27Z, etcd/peer-server-client.crt: certificate CN=etcd-peer will expire within 90 days at 2024-04-27T18:56:27Z, scheduler/client-scheduler.crt: certificate CN=system:kube-scheduler will expire within 90 days at 2024-04-27T18:56:27Z, kube-proxy/client-kube-proxy.crt: certificate CN=system:kube-proxy will expire within 90 days at 2024-04-27T18:56:27Z, kube-proxy/client-kube-proxy.crt: certificate CN=system:kube-proxy will expire within 90 days at 2024-04-27T18:56:27Z, kubelet/client-kubelet.crt: certificate CN=system:node:k3s-server-1,O=system:nodes will expire within 90 days at 2024-04-27T18:56:29Z, kubelet/serving-kubelet.crt: certificate CN=k3s-server-1 will expire within 90 days at 2024-04-27T18:56:28Z, k3s-controller/client-k3s-controller.crt: certificate CN=system:k3s-controller will expire within 90 days at 2024-04-27T18:56:27Z, 
k3s-controller/client-k3s-controller.crt: certificate CN=system:k3s-controller will expire within 90 days at 2024-04-27T18:56:27Z
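The event message packs every expiring certificate into one comma-separated string, and some entries repeat. A minimal sketch, assuming the message keeps the wording shown above, of how a consumer might split it into distinct entries; `parse_warning` is a hypothetical helper, not part of k3s. A regex is used instead of splitting on commas because subjects such as `CN=system:admin,O=system:masters` contain commas themselves.

```python
import re
from datetime import datetime, timezone

# One entry looks like:
#   admin/client-admin.crt: certificate CN=... will expire within 90 days at 2024-04-27T18:56:27Z
# The wording is taken from the event above and may differ in other versions.
ENTRY_RE = re.compile(
    r"(?P<file>[\w./-]+\.crt): certificate (?P<subject>.+?) "
    r"will expire within (?P<window>\d+) days at "
    r"(?P<expires>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
)


def parse_warning(message):
    """Return a list of distinct certificate entries found in the message."""
    entries = []
    seen = set()
    for match in ENTRY_RE.finditer(message):
        key = (match.group("file"), match.group("subject"))
        if key in seen:
            continue  # the event can list the same certificate twice
        seen.add(key)
        expires = datetime.strptime(
            match.group("expires"), "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        entries.append({
            "file": match.group("file"),
            "subject": match.group("subject"),
            "window_days": int(match.group("window")),
            "expires": expires,
        })
    return entries


# Shortened sample built from entries in the event above.
message = (
    "Node certificates require attention - restart k3s on this node to trigger "
    "automatic rotation: "
    "admin/client-admin.crt: certificate CN=system:admin,O=system:masters "
    "will expire within 90 days at 2024-04-27T18:56:27Z, "
    "kube-proxy/client-kube-proxy.crt: certificate CN=system:kube-proxy "
    "will expire within 90 days at 2024-04-27T18:56:27Z, "
    "kube-proxy/client-kube-proxy.crt: certificate CN=system:kube-proxy "
    "will expire within 90 days at 2024-04-27T18:56:27Z"
)

entries = parse_warning(message)
for entry in entries:
    print(entry["file"], entry["subject"], entry["expires"].isoformat())
```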
brandond force-pushed the cert-expire-warning branch from fe06dc7 to a24a5fa (March 26, 2024 00:43)
brandond changed the title from "[WIP] Add certificate expiry check and warnings" to "Add certificate expiry check and warnings" (Mar 26, 2024)
brandond force-pushed the cert-expire-warning branch from a24a5fa to 5870bde (March 26, 2024 00:55)
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #9772 +/- ##
===========================================
- Coverage 52.94% 40.87% -12.08%
===========================================
Files 154 153 -1
Lines 13601 13667 +66
===========================================
- Hits 7201 5586 -1615
- Misses 5038 6957 +1919
+ Partials 1362 1124 -238
brandond force-pushed the cert-expire-warning branch 2 times, most recently from ee1769a to 0f026a2 (March 26, 2024 03:31)
vitorsavian previously approved these changes (Mar 26, 2024)
brandond force-pushed the cert-expire-warning branch from 0f026a2 to c4dffc5 (March 26, 2024 20:14)
brandond changed the title from "Add certificate expiry check and warnings" to "Add certificate expiry check, events, and metrics" (Mar 26, 2024)
* Add ADR
* Add `k3s certificate check` command.
* Add periodic check and events when certs are about to expire.
* Add metrics for certificate validity remaining, labeled by cert subject

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond force-pushed the cert-expire-warning branch from c4dffc5 to a49efe4 (March 26, 2024 21:13)
briandowns approved these changes (Mar 27, 2024)
galal-hussein approved these changes (Mar 27, 2024)
vitorsavian approved these changes (Mar 28, 2024)
This was referenced Apr 9, 2024
Proposed Changes
* Fix `k3s certificate rotate` failing on agents due to missing server token file
* Add `k3s certificate check` command, to check and print the status of server and agent certificates

Types of Changes
enhancement

Verification
* Set `CATTLE_NEW_SIGNED_CERT_EXPIRATION_DAYS=30` in service environment to trigger short-lived certs
* Run `k3s certificate check` command

Testing

Linked Issues
* `k3s certificate rotate` fails on agents #9785

User-Facing Change

Further Comments