security: bugfix, ensure cert expiry metrics reflect reloaded certs #135596
dhartunian
left a comment
Generally LGTM, just some minor notes. Please add backport labels as well.
Reviewed 1 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons and @kyle-a-wong)
pkg/security/certificate_metrics.go line 168 at r1 (raw file):
    )

    type certClosure func() *CertInfo
I think if you're going to name this type, might as well add a docstring explaining why it's needed.
pkg/security/certificate_metrics.go line 212 at r1 (raw file):
    b := aggmetric.MakeBuilder(SQLUserLabel)
    return &Metrics{
        CAExpiration: expirationGauge(metaCAExpiration, cm.CACert),
I don't see the diff that makes this exported. Am I missing something?
pkg/security/certificate_metrics.go line 213 at r1 (raw file):
    return &Metrics{
        CAExpiration: expirationGauge(metaCAExpiration, cm.CACert),
        TenantExpiration: expirationGauge(metaTenantExpiration, func() *CertInfo { return cm.tenantCert }),
should we use certClosure here and below?
angles-n-daemons
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @kyle-a-wong)
pkg/security/certificate_metrics.go line 168 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
I think if you're going to name this type, might as well add a docstring explaining why it's needed.
That's a good idea, adding it now.
pkg/security/certificate_metrics.go line 212 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
I don't see the diff that makes this exported. Am I missing something?
cm.caCert is a reference, cm.CACert is a function. It's not just that the latter is public, it's that it always returns the right certificate.
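To illustrate the distinction in the reply above, here is a minimal sketch of why passing the accessor matters. The `manager` type, its `reload` method, and the field layout are simplified assumptions for illustration, not the actual `CertificateManager` code.

```go
package main

import (
	"fmt"
	"sync"
)

// CertInfo is a simplified stand-in for the real type.
type CertInfo struct {
	ExpirationTime int64
}

// manager is a hypothetical, stripped-down certificate manager.
type manager struct {
	mu     sync.RWMutex
	caCert *CertInfo
}

// CACert returns whichever CA certificate is loaded at call time.
func (m *manager) CACert() *CertInfo {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.caCert
}

// reload swaps in a newly loaded certificate (hypothetical helper).
func (m *manager) reload(c *CertInfo) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.caCert = c
}

func main() {
	m := &manager{caCert: &CertInfo{ExpirationTime: 100}}

	snapshot := m.caCert // pointer copied once; never sees later reloads
	getter := m.CACert   // method value; reads the field on every call

	m.reload(&CertInfo{ExpirationTime: 200})

	fmt.Println(snapshot.ExpirationTime) // 100 (stale)
	fmt.Println(getter().ExpirationTime) // 200 (current)
}
```

Passing `cm.CACert` without parentheses hands over the method value rather than its result, so the lookup is deferred until the gauge is read.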
pkg/security/certificate_metrics.go line 213 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
should we use certClosure here and below?
I'm not sure I understand how, do you mean to cast the whole thing explicitly?
dhartunian
left a comment
LGTM. No further comments, thanks.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @angles-n-daemons and @kyle-a-wong)
pkg/security/certificate_metrics.go line 212 at r1 (raw file):
Previously, angles-n-daemons (Brian Dillmann) wrote…
cm.caCert is a reference, cm.CACert is a function. It's not just that the latter is public, it's that it always returns the right certificate.
Ah gotcha. Thanks for clarifying.
pkg/security/certificate_metrics.go line 213 at r1 (raw file):
Previously, angles-n-daemons (Brian Dillmann) wrote…
I'm not sure I understand how, do you mean to cast the whole thing explicitly?
My brain wasn't working. Disregard :) I think I was just visually pattern matching and forgot that the custom type does not magically become a new function constructor.
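For readers following the `certClosure` thread, a small sketch of the Go rule involved: a function literal is assignable to a named function type with the same signature, so no explicit conversion is required at the call site. The `expiration` helper below is hypothetical and stands in for whatever consumes a `certClosure` in the real diff.

```go
package main

import "fmt"

// CertInfo is a simplified stand-in for the real type.
type CertInfo struct{ ExpirationTime int64 }

// certClosure mirrors the named type quoted from the diff.
type certClosure func() *CertInfo

// expiration is a hypothetical consumer that accepts the named type.
func expiration(fn certClosure) int64 {
	if c := fn(); c != nil {
		return c.ExpirationTime
	}
	return 0
}

func main() {
	cert := &CertInfo{ExpirationTime: 42}

	// A plain function literal is assignable to certClosure as-is.
	implicit := expiration(func() *CertInfo { return cert })

	// An explicit conversion also compiles, but adds nothing.
	explicit := expiration(certClosure(func() *CertInfo { return cert }))

	fmt.Println(implicit, explicit) // 42 42
}
```

Naming the type documents intent; it does not provide a constructor, which is why the literal `func() *CertInfo { ... }` still has to be written out.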
bors r+
security: bugfix, ensure cert expiry metrics reflect reloaded certs
PR #130110 added certificate TTL metrics alongside our existing expiration metrics. Prior to that change, the certificate metrics values were updated on each metrics load. Afterwards, new metrics objects were created for each load of certificates.
This introduced a bug: the new expiration values never appeared in any of the system exhaust (metrics scrape or tsdb), because the registered metrics objects were still the ones created on startup.
This change instead lets the metrics close over the whole CertificateManager object, so they only need to be created once, and the metrics registered at startup keep reporting current values.
Release note (bug fix): security.certificate.* metrics will now be updated if a node loads new certificates while running.
Epic: none
Fixes: #135093
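
The commit message above can be read as the following simplified model. It is a sketch under stated assumptions: `manager`, `gauge`, `snapshotGauge`, and `closureGauge` are hypothetical names invented for illustration and are not the CockroachDB metric types or the actual pre-fix code.

```go
package main

import "fmt"

// CertInfo is a simplified stand-in for the real type.
type CertInfo struct{ ExpirationTime int64 }

// manager is a hypothetical stand-in for CertificateManager.
type manager struct{ tenantCert *CertInfo }

// gauge is a hypothetical metric that computes its value on every read.
type gauge struct{ read func() int64 }

func (g *gauge) Value() int64 { return g.read() }

// snapshotGauge models the buggy shape: the value is fixed at creation.
func snapshotGauge(c *CertInfo) *gauge {
	v := c.ExpirationTime
	return &gauge{read: func() int64 { return v }}
}

// closureGauge models the fixed shape: the certificate is looked up per read.
func closureGauge(cert func() *CertInfo) *gauge {
	return &gauge{read: func() int64 { return cert().ExpirationTime }}
}

func main() {
	m := &manager{tenantCert: &CertInfo{ExpirationTime: 100}}

	// Both gauges are registered once, at startup.
	registry := []*gauge{
		snapshotGauge(m.tenantCert),
		closureGauge(func() *CertInfo { return m.tenantCert }),
	}

	// A reload swaps the certificate and, in the buggy shape, builds a new
	// gauge that nothing ever registers.
	m.tenantCert = &CertInfo{ExpirationTime: 200}
	orphan := snapshotGauge(m.tenantCert)

	fmt.Println(registry[0].Value()) // 100: registered, but stale
	fmt.Println(orphan.Value())      // 200: fresh, but never scraped
	fmt.Println(registry[1].Value()) // 200: registered and current
}
```

The sketch runs on a single goroutine and assumes a certificate is always loaded; real code would need locking around the field and a nil check before reading it.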