fix: CrashLoopBackOff once tlsProfile changed #640
Skipping CI for Draft Pull Request.
Hi, if this PR is still in progress, can you please convert it to a draft so that CI won't run?
In general, this goes in the right direction. The GetConfigForClient functions still need to be deduplicated and improved, and they should not be lambda functions.
controllers/ssp_controller.go
Outdated
@@ -239,9 +239,6 @@ func (r *sspReconciler) isRestartNeeded(sspObj *ssp.SSP) bool {
if reflect.DeepEqual(r.lastSspSpec, ssp.SSPSpec{}) {
This func never returns true, so it can likely be dropped.
I've removed my comment here, because it was irrelevant. Sorry for the noise.
main.go
Outdated
os.Exit(1)
}

var webhookServer *webhook.Server
if tlsOptions.IsEmpty() {
Not sure we can do this. I don't think this allows the config to be changed later without a restart?
tests/crypto_policy_test.go
Outdated
@@ -65,7 +64,7 @@ var _ = Describe("Crypto Policy", func() {
},
},
}

// is not supported anymore
What does this mean?
tests/crypto_policy_test.go
Outdated
Namespace: strategy.GetSSPDeploymentNameSpace(),
}, deployment)
Expect(err).ToNot(HaveOccurred())
Expect(deployment.Status.ReadyReplicas).To(BeNumerically(">=", 1))
Relying on ReadyReplicas is risky here; it might be better to use UpdatedReplicas. @jcanocan
You can check both just to be sure.
main.go
Outdated
tlsOptions, err := common.GetSspTlsOptions(ctx)
if err != nil {
setupLog.Error(err, "Error while getting tls profile")
os.Exit(1)
Do not exit here; it is better to return an error.
main.go
Outdated
}
cfg.CipherSuites = tlsOptions.CipherIDs(&setupLog)
cfg.MinVersion, _ = minTLSVersionId(tlsOptions.MinTLSVersion)
certificate, err := loadCertificates(path.Join(sdkTLSDir, sdkTLSCrt), path.Join(sdkTLSDir, sdkTLSKey))
Not sure how expensive it is to load certs on every request. You might cache some of this.
Agreed. There is already a certificate watcher in the controller-runtime library. Maybe we can use that, I will check...
You can simply create a new CertWatcher and then add it to the manager to start it. Or call Start() manually in a separate goroutine.
main.go
Outdated
// webhook server configuration is used.
if sspTLSOptions.IsEmpty() {
return nil
// vm-console-proxy function
Can you please add a link to that function, so I can compare the code here and there and see if there are any changes?
main.go
Outdated
funcs := []func(*tls.Config){tlsCfgFunc}
return &webhook.Server{Port: webhookPort, TLSMinVersion: sspTLSOptions.MinTLSVersion, TLSOpts: funcs}
// crypto_policy.go function
Can you please add a link to that function in that file?
main.go
Outdated
Addr: metricsAddr,
Handler: mux,
TLSConfig: &tls.Config{
GetConfigForClient: func(_ *tls.ClientHelloInfo) (*tls.Config, error) {
Can you please extract it to a new function, to keep runPrometheusServer() short?
main.go
Outdated
@@ -138,26 +180,63 @@ func main() {

ctx := ctrl.SetupSignalHandler()

tlsOptions, err := common.GetSspTlsOptions(ctx)
tlsCfgFunc := func(cfg *tls.Config) {
Is it possible to keep that function outside of main()? I think it's good practice to keep main() short; the code is easier to read that way.
Currently main() is very long; in general, it should be very short.
Also, I'd suggest a shorter commit title, and updating the PR description to match the commit description (so you won't forget to sync them later). Now you have:
I'd suggest naming it according to the sentence "This commit will fix: CrashLoopBackOff once tlsProfile changed". For example:
main.go
Outdated
cfg.MinVersion, _ = minTLSVersionId(tlsOptions.MinTLSVersion)
certificate, err := loadCertificates(path.Join(sdkTLSDir, sdkTLSCrt), path.Join(sdkTLSDir, sdkTLSKey))
if err != nil {
setupLog.Error(err, "Prometheus server error")
No need to log the error here. Just return the error and the calling function will log it for you.
main.go
Outdated
@@ -66,53 +68,115 @@ const (
webhookPort = 9443
)

func runPrometheusServer(metricsAddr string, tlsOptions common.SSPTLSOptions) error {
// crypto_policy.go function
func minTLSVersionId(s string) (uint16, error) {
Why not use the func in internal/common/crypto_policy.go?
main.go
Outdated
return &tls.Config{
GetConfigForClient: func(_ *tls.ClientHelloInfo) (*tls.Config, error) {
cfg := &tls.Config{}
tlsOptions, err := common.GetSspTlsOptions(ctx)
common.GetSspTlsOptions makes API calls on every request if it stays like this. IMO we should have a Watcher/SharedInformer for the SSP CR and use it to retrieve the SSP when it changes. Otherwise the SSP CR should be cached.
main.go
Outdated
return cfg, nil
}
cfg.CipherSuites = tlsOptions.CipherIDs(&setupLog)
setupLog.Info("Configured ciphers", "ciphers", cfg.CipherSuites)
IMO we don't need to log this on every request.
main.go
Outdated
// please be aware that the APIServer is using http keepalive so this is going to
// be executed only after a while for fresh connections and not on existing ones
cfg.GetConfigForClient = func(_ *tls.ClientHelloInfo) (*tls.Config, error) {
tlsOptions, err := common.GetSspTlsOptions(ctx)
Same issue as above: the SSP CR needs to be cached.
main.go
Outdated
}

cfg.CipherSuites = tlsOptions.CipherIDs(&setupLog)
setupLog.Info("Configured ciphers", "ciphers", cfg.CipherSuites)
Same as above.
f886ca7 to 82e898e
main.go
Outdated
GetConfigForClient: func(_ *tls.ClientHelloInfo) (*tls.Config, error) {
cfg := &tls.Config{}
var err error
cfg.GetCertificate = certWatcher.GetCertificate

if controllers.TLSProfile == nil {
cfg.MinVersion = crypto.DefaultTLSVersion()
cfg.CipherSuites = nil
return cfg, nil
}

if controllers.TLSProfile.Type == ocpconfigv1.TLSProfileCustomType {
cfg.CipherSuites = common.CipherIDs(controllers.TLSProfile.Custom.Ciphers)
cfg.MinVersion, err = crypto.TLSVersion(string(controllers.TLSProfile.Custom.MinTLSVersion))
if err != nil {
return nil, err
}
return cfg, nil
}

cipherNames, minTypedTLSVersion := ocpconfigv1.TLSProfiles[controllers.TLSProfile.Type].Ciphers, ocpconfigv1.TLSProfiles[controllers.TLSProfile.Type].MinTLSVersion
cfg.CipherSuites = common.CipherIDs(cipherNames)
cfg.MinVersion, err = crypto.TLSVersion(string(minTypedTLSVersion))
if err != nil {
return nil, err
}
return cfg, nil
},
Can we extract this to an external function? It is duplicated in getTLSConfigFunc().
Indeed
controllers/ssp_controller.go
Outdated
@@ -67,6 +66,8 @@ var kvsspCRDs = map[string]string{
"kubevirtcommontemplatesbundles.ssp.kubevirt.io": "KubevirtCommonTemplatesBundle",
}

var TLSProfile *osconfv1.TLSSecurityProfile
Please avoid globals.
@@ -192,8 +193,8 @@ func (ti *TLSInfo) CreateTlsConfig() *tls.Config {
}

if !ti.sspTLSOptions.IsEmpty() {
tlsConfig.CipherSuites = ti.sspTLSOptions.CipherIDs(nil)
minVersion, err := ti.sspTLSOptions.MinTLSVersionId()
tlsConfig.CipherSuites = common.CipherIDs(ti.sspTLSOptions.OpenSSLCipherNames)
Is there a lib for this too?
There is this function, but the problem is that it panics for some cipher names, and we just want to log that the cipher is not supported:
ssp-operator/vendor/github.com/openshift/library-go/pkg/crypto/crypto.go
Lines 233 to 245 in fc1d67c
func CipherSuitesOrDie(cipherNames []string) []uint16 {
if len(cipherNames) == 0 {
return DefaultCiphers()
}
cipherValues := []uint16{}
for _, cipherName := range cipherNames {
cipher, err := CipherSuite(cipherName)
if err != nil {
panic(err)
}
cipherValues = append(cipherValues, cipher)
}
return cipherValues
main.go
Outdated
}

if controllers.TLSProfile.Type == ocpconfigv1.TLSProfileCustomType {
cfg.CipherSuites = common.CipherIDs(controllers.TLSProfile.Custom.Ciphers)
Can you dedup the calls to common.CipherIDs and crypto.TLSVersion? This block and the block below look very similar.
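One possible shape for the deduplication requested above: branch on the profile type only to pick the cipher names and minimum version, then run a single conversion path on the result. The types below are simplified, hypothetical stand-ins for the OpenShift profile types, and the `predefined` map holds placeholder values, not the real profile definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// profileSpec and securityProfile are hypothetical, simplified stand-ins.
type profileSpec struct {
	Ciphers       []string
	MinTLSVersion string
}

type securityProfile struct {
	Type   string       // "Custom" or the name of a predefined profile
	Custom *profileSpec // only set when Type is "Custom"
}

// predefined contains placeholder data for illustration only.
var predefined = map[string]profileSpec{
	"Intermediate": {Ciphers: []string{"TLS_AES_128_GCM_SHA256"}, MinTLSVersion: "VersionTLS12"},
}

// selectSpec is the only place that branches on the profile type.
// The caller converts the returned names and version exactly once.
func selectSpec(p securityProfile) (profileSpec, error) {
	if p.Type == "Custom" {
		if p.Custom == nil {
			return profileSpec{}, errors.New("custom profile has no spec")
		}
		return *p.Custom, nil
	}
	spec, ok := predefined[p.Type]
	if !ok {
		return profileSpec{}, fmt.Errorf("unknown profile type %q", p.Type)
	}
	return spec, nil
}

func main() {
	custom := securityProfile{Type: "Custom", Custom: &profileSpec{MinTLSVersion: "VersionTLS13"}}
	for _, p := range []securityProfile{custom, {Type: "Intermediate"}, {Type: "Bogus"}} {
		spec, err := selectSpec(p)
		fmt.Println(p.Type, spec.MinTLSVersion, err)
	}
}
```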
/cc
A partial review. I will look at it more later.
@@ -192,8 +193,8 @@ func (ti *TLSInfo) CreateTlsConfig() *tls.Config {
}

if !ti.sspTLSOptions.IsEmpty() {
tlsConfig.CipherSuites = ti.sspTLSOptions.CipherIDs(nil)
minVersion, err := ti.sspTLSOptions.MinTLSVersionId()
tlsConfig.CipherSuites = common.CipherIDs(ti.sspTLSOptions.OpenSSLCipherNames, &logger.Log)
Please keep the nil log here, because the change is not part of this PR. If you want, you can open a new PR to add the log.
tlsConfig.CipherSuites = ti.sspTLSOptions.CipherIDs(nil)
minVersion, err := ti.sspTLSOptions.MinTLSVersionId()
tlsConfig.CipherSuites = common.CipherIDs(ti.sspTLSOptions.OpenSSLCipherNames, &logger.Log)
minVersion, err := crypto.TLSVersion(ti.sspTLSOptions.MinTLSVersion)
This is not correct, because the crypto.TLSVersion() function expects one of these names:
var versions = map[string]uint16{
"VersionTLS10": tls.VersionTLS10,
"VersionTLS11": tls.VersionTLS11,
"VersionTLS12": tls.VersionTLS12,
"VersionTLS13": tls.VersionTLS13,
}
So it expects a string like VersionTLS12, but ti.sspTLSOptions.MinTLSVersion is a string returned by this function:
ssp-operator/internal/common/crypto_policy.go
Lines 119 to 134 in 11c2e89
func tlsVersionToHumanReadable(version ocpv1.TLSProtocolVersion) (string, error) {
switch version {
case "":
return "", nil
case ocpv1.VersionTLS10:
return "1.0", nil
case ocpv1.VersionTLS11:
return "1.1", nil
case ocpv1.VersionTLS12:
return "1.2", nil
case ocpv1.VersionTLS13:
return "1.3", nil
default:
return "", fmt.Errorf("invalid ocpv1.VersionTLS %v", version)
}
}
For example: 1.2.
main.go
Outdated
"github.com/prometheus/client_golang/prometheus/promhttp"
ssp "kubevirt.io/ssp-operator/api/v1beta2"
Please move local imports to the block below.
main.go
Outdated
// This callback executes on each client call returning a new config to be used
// please be aware that the APIServer is using http keepalive so this is going to
// be executed only after a while for fresh connections and not on existing ones
func getConfigForClientCallback(cfg *tls.Config, apiClient client.Client, ctx context.Context) (*tls.Config, error) {
Three comments for this function:
- After thinking about this more, I think it is better to have a parameter of type cache.Cache instead of client.Client. It has the same reading methods as Client, but cannot be used for writing.
- Usually the context.Context parameter is the first.
- Please don't include the word Callback in a function name. In my opinion, it's better to name a function by what it does than by how it is used.
main.go
Outdated
// please be aware that the APIServer is using http keepalive so this is going to
// be executed only after a while for fresh connections and not on existing ones
func getConfigForClientCallback(cfg *tls.Config, apiClient client.Client, ctx context.Context) (*tls.Config, error) {
var err error
This line can be removed, because err is redefined below.
cfg.CipherSuites = common.CipherIDs(tlsProfile.Custom.Ciphers, &setupLog)
cfg.MinVersion, err = crypto.TLSVersion(string(tlsProfile.Custom.MinTLSVersion))
if err != nil {
return nil, err
When this error is returned, the cfg parameter has already been modified. Can you change the code so that on error, the cfg parameter is kept unchanged?
cfg.CipherSuites = common.CipherIDs(cipherNames, &setupLog)
cfg.MinVersion, err = crypto.TLSVersion(string(minTypedTLSVersion))
if err != nil {
return nil, err
Similarly here, please keep the cfg unchanged on error.
This too?
main.go
Outdated
}
}

func getPrometheusServer(metricsAddr string, ctx context.Context) (*prometheusServer, error) {
Can you rename this function to newPrometheusServer()? Because it is a constructor.
main.go
Outdated
setupLog.Info("Starting Prometheus metrics endpoint server with TLS")
metrics.Registry.MustRegister(common_templates.CommonTemplatesRestored)
metrics.Registry.MustRegister(common.SSPOperatorReconcileSucceeded)
handler := promhttp.HandlerFor(metrics.Registry, promhttp.HandlerOpts{})
mux := http.NewServeMux()
mux.Handle("/metrics", handler)
Can you move this logic to the prometheusServer.Start() method? It makes sense to call it when the server is started, not when it's created.
main.go
Outdated
certWatcher, err := certwatcher.New(metricsServer.certPath, metricsServer.keyPath)
if err != nil {
return nil, err
}

tlsCfgFunc := func(cfg *tls.Config) {
cfg.CipherSuites = sspTLSOptions.CipherIDs(&setupLog)
setupLog.Info("Configured ciphers", "ciphers", cfg.CipherSuites)
go func() {
if err := certWatcher.Start(ctx); err != nil {
setupLog.Error(err, "certificate watcher error")
}
}()
Similarly, can you move the certWatcher creation and start to the prometheusServer.Start() method?
Added some comments
main.go
Outdated
var tlsProfile *ocpconfigv1.TLSSecurityProfile

if len(sspList.Items) != 0 {
tlsProfile = sspList.Items[0].Spec.TLSSecurityProfile
}

if tlsProfile == nil {
cfg.MinVersion = crypto.DefaultTLSVersion()
cfg.CipherSuites = nil
return cfg, nil
}
- var tlsProfile *ocpconfigv1.TLSSecurityProfile
- if len(sspList.Items) != 0 {
- tlsProfile = sspList.Items[0].Spec.TLSSecurityProfile
- }
- if tlsProfile == nil {
- cfg.MinVersion = crypto.DefaultTLSVersion()
- cfg.CipherSuites = nil
- return cfg, nil
- }
+ if len(sspList.Items) == 0 {
+ cfg.MinVersion = crypto.DefaultTLSVersion()
+ cfg.CipherSuites = nil
+ return cfg, nil
+ }
+ tlsProfile := sspList.Items[0].Spec.TLSSecurityProfile
Do it like this?
We still need to check whether spec.TLSSecurityProfile is nil, so we would need to do it like this:
if len(sspList.Items) == 0 || sspList.Items[0].Spec.TLSSecurityProfile == nil {
cfg.MinVersion = crypto.DefaultTLSVersion()
cfg.CipherSuites = nil
return cfg, nil
}
tlsProfile := sspList.Items[0].Spec.TLSSecurityProfile
ACK
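The agreed check covers two cases with one default branch: no SSP object at all, and an SSP without a TLS profile. As a sketch, the same logic can be isolated in a small helper; the types here are hypothetical stand-ins for the SSP API types.

```go
package main

import "fmt"

// Hypothetical, minimal stand-ins for the SSP API types.
type tlsProfile struct{ Type string }
type sspSpec struct{ TLSSecurityProfile *tlsProfile }
type sspItem struct{ Spec sspSpec }

// firstTLSProfile returns nil both when the list is empty and when the
// first SSP has no profile, so callers need a single nil check.
func firstTLSProfile(items []sspItem) *tlsProfile {
	if len(items) == 0 {
		return nil
	}
	return items[0].Spec.TLSSecurityProfile
}

func main() {
	fmt.Println(firstTLSProfile(nil) == nil)
	fmt.Println(firstTLSProfile([]sspItem{{}}) == nil)

	withProfile := []sspItem{{Spec: sspSpec{TLSSecurityProfile: &tlsProfile{Type: "Custom"}}}}
	fmt.Println(firstTLSProfile(withProfile).Type)
}
```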
main.go
Outdated
return cfg, nil
}

cipherNames := ocpconfigv1.TLSProfiles[tlsProfile.Type].Ciphers
Maybe inline this?
cfg.CipherSuites = common.CipherIDs(cipherNames, &setupLog)
cfg.MinVersion, err = crypto.TLSVersion(string(minTypedTLSVersion))
if err != nil {
return nil, err
This too?
main.go
Outdated
cipherNames := ocpconfigv1.TLSProfiles[tlsProfile.Type].Ciphers
cfg.CipherSuites = common.CipherIDs(cipherNames, &ctrl.Log)

minTypedTLSVersion := ocpconfigv1.TLSProfiles[tlsProfile.Type].MinTLSVersion
This too?
main.go
Outdated
setupLog.Info("Starting Prometheus metrics endpoint server with TLS")
metrics.Registry.MustRegister(common_templates.CommonTemplatesRestored)
metrics.Registry.MustRegister(common.SSPOperatorReconcileSucceeded)
handler := promhttp.HandlerFor(metrics.Registry, promhttp.HandlerOpts{})
mux := http.NewServeMux()
mux.Handle("/metrics", handler)

minTlsVersion, err := tlsOptions.MinTLSVersionId()
s.server.Handler = mux
Merge it with L138?
It is not possible to call the s.server.Handler.Handle() function, so we probably cannot do it like that.
ACK
tests/crypto_policy_test.go
Outdated
Namespace: strategy.GetSSPDeploymentNameSpace(),
}, deployment)
Expect(err).ToNot(HaveOccurred())
Expect(deployment.Status.ReadyReplicas).To(BeNumerically(">=", 1))
This should assert that we have the number of ready and updated replicas specified in spec.Replicas, not just >= 1.
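The stricter assertion asked for above boils down to comparing both counters against the desired replica count. The predicate below is a framework-free sketch of that comparison; `deploymentStatus` is a hypothetical subset of the apps/v1 Deployment fields, and in the real test the check would live inside the Gomega assertions.

```go
package main

import "fmt"

// deploymentStatus is a hypothetical subset of the Deployment fields.
type deploymentStatus struct {
	SpecReplicas    *int32 // nil means the API default of 1
	ReadyReplicas   int32
	UpdatedReplicas int32
}

// rolledOut reports whether all desired replicas are both updated and ready.
func rolledOut(d deploymentStatus) bool {
	want := int32(1)
	if d.SpecReplicas != nil {
		want = *d.SpecReplicas
	}
	return d.ReadyReplicas == want && d.UpdatedReplicas == want
}

func main() {
	two := int32(2)
	// Two pods are ready, but only one runs the new revision.
	fmt.Println(rolledOut(deploymentStatus{SpecReplicas: &two, ReadyReplicas: 2, UpdatedReplicas: 1}))
	fmt.Println(rolledOut(deploymentStatus{SpecReplicas: &two, ReadyReplicas: 2, UpdatedReplicas: 2}))
}
```

The first case is the one a `>= 1` check on ReadyReplicas alone would wrongly accept.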
@0xFelix @akrejcir @opokornyy I have two questions:
Thanks! :)
One more question and one nit.
- IMO this should be fixed in a follow up PR. It has nothing to do with SSP restarting.
- Good point!
main.go
Outdated
}

tlsProfile := sspList.Items[0].Spec.TLSSecurityProfile

nit: newline
tests/crypto_policy_test.go
Outdated
g.Expect(err).ToNot(HaveOccurred())
g.Expect(deployment.Status.ReadyReplicas).To(BeNumerically(">=", 1))
}, env.Timeout(), time.Second).Should(Succeed())
deployment := &apps.Deployment{}
Do we still need Eventually here, or does strategy.RevertToOriginalSspCr wait?
It should wait. Based on the previous comment, the waiting was needed only because of the bug:
ssp-operator/tests/crypto_policy_test.go
Lines 110 to 112 in 96302e1
// Because of bug[1], the SSP operator will move to CrashLoopBackOff state,
// so we need to wait until it is running.
// [1] - https://bugzilla.redhat.com/show_bug.cgi?id=2151248
ssp-operator/tests/tests_suite_test.go
Lines 360 to 364 in 96302e1
func (s *existingSspStrategy) RevertToOriginalSspCr() {
waitForSspDeletionIfNeeded(s.ssp)
createOrUpdateSsp(s.ssp)
waitUntilDeployed()
}
tests/crypto_policy_test.go
Outdated
deployment := &apps.Deployment{}
err := apiClient.Get(ctx, client.ObjectKey{
Name: strategy.GetSSPDeploymentName(),
Namespace: strategy.GetSSPDeploymentNameSpace(),
}, deployment)
Expect(err).ToNot(HaveOccurred())
Expect(deployment.Status.ReadyReplicas).To(Equal(*deployment.Spec.Replicas))
Thinking about it, can't we drop all of this?
IMO we can drop it
Thank you! It looks good to me.
/approve
/retest-required
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: 0xFelix
TLSConfig: &tlsConfig,
}
go func() {
if err := certWatcher.Start(ctx); err != nil {
Ideally we should use a different context for the certWatcher than ctx, so that it can be closed when this function returns on error.
We can do it in a follow-up PR.
server.TLSConfig = s.getPrometheusTLSConfig(ctx, certWatcher)

if err := server.ListenAndServeTLS(s.certPath, s.keyPath); err != nil {
setupLog.Error(err, "Failed to start Prometheus metrics endpoint server")
The error should be returned from this function.
The documentation for ListenAndServeTLS() says that it always returns a non-nil error:
https://github.com/golang/go/blob/0ae54ddd37302bdd2a8c775135bf5f076a18eeb3/src/net/http/server.go#L3281-L3283
These links are for go 1.19, but the same comments are in master.
err := server.ListenAndServeTLS(path.Join(sdkTLSDir, sdkTLSCrt), path.Join(sdkTLSDir, sdkTLSKey))
if err != nil {
setupLog.Error(err, "Failed to start Prometheus metrics endpoint server")
<-ctx.Done()
When server.ListenAndServeTLS() below returns because of an error and this Start() function returns, this goroutine will still be waiting until the context is closed.
Can you change the code so that before Start() returns, it makes sure that this goroutine has finished?
I've reconsidered this.
In this PR, we are already letting the above goroutine leak from the Start() function, so this one can be left as is.
Can you add a // TODO comment to the code next to both goroutines, and open a new GitHub issue, so that we don't forget to open a follow-up PR?
Done
This commit adds the usage of callbacks to Prometheus and Webhook server to fetch TLS config on every request. Signed-off-by: Ondrej Pokorny <opokorny@redhat.com>
Kudos, SonarCloud Quality Gate passed! 0 Bugs, no coverage information.
/lgtm
/cherry-pick release-v0.18
@opokornyy: new pull request created: #668
What this PR does / why we need it:
This PR adds the usage of callbacks to Prometheus and the webhook server, enabling the dynamic reloading of the TLS configuration. This eliminates the necessity of restarting the SSP pod, which had previously led to the pod entering the CrashLoopBackOff state.
Fixes #
https://bugzilla.redhat.com/show_bug.cgi?id=2151248
Release note: