-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gateway2: Add readiness and liveness probes to the kube gloo gateway pod #10229
Conversation
Issues linked to changelog: |
- `kubeGateway.gatewayParameters.glooGateway.podTemplate.terminationGracePeriodSeconds` to specify the terminationGracePeriodSeconds. | ||
- `kubeGateway.gatewayParameters.glooGateway.podTemplate.gracefulShutdown` to configure the graceful shutdown config for the envoy container. | ||
- `kubeGateway.gatewayParameters.glooGateway.podTemplate.customLivenessProbe` to specify a custom liveness probe for the envoy container. | ||
- `kubeGateway.gatewayParameters.glooGateway.podTemplate.livenessProbeEnabled` to enable the liveness probe. If the customLivenessProbe is not specified, no liveness probe is set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- `kubeGateway.gatewayParameters.glooGateway.podTemplate.livenessProbeEnabled` to enable the liveness probe. If the customLivenessProbe is not specified, no liveness probe is set. |
should we also remove the option in other places from the helm values?
install/helm/gloo/generate/values.go
Outdated
@@ -339,9 +339,19 @@ type GatewayParameters struct { | |||
Stats *GatewayParamsStatsConfig `json:"stats,omitempty" desc:"Config used to manage the stats endpoints exposed on the deployed proxies"` | |||
AIExtension *GatewayParamsAIExtension `json:"aiExtension,omitempty" desc:"Config used to manage the Gloo Gateway AI extension."` | |||
FloatingUserId *bool `json:"floatingUserId,omitempty" desc:"If true, allows the cluster to dynamically assign a user ID for the processes running in the container. Default is false."` | |||
PodTemplate *GatewayParamsPodTemplate `json:"podTemplate,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add description
type GatewayParamsPodTemplate struct { | ||
GracefulShutdown *GracefulShutdownSpec `json:"gracefulShutdown,omitempty"` | ||
TerminationGracePeriodSeconds *int `json:"terminationGracePeriodSeconds,omitempty" desc:"Time in seconds to wait for the pod to terminate gracefully. See [kubernetes docs](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#istio-lifecycle) for more info."` | ||
Probes *bool `json:"probes,omitempty" desc:"Set to true to enable a readiness probe (default is false). Then, you can also enable a liveness probe."` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is a bool flag can we call it enableProbes
?
also, a little confused about the description:
Set to true to enable a readiness probe (default is false). Then, you can also enable a liveness probe.
should there be 2 different flags, enableLivenessProbe and enableReadinessProbe? or does this one flag enable both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to mimic the edge classic gateway proxy settings where probes controls the readiness probe.
Since there's no default liveness probe, a custom one can be set with the customLivenessProbe without the need for a bool to enable it
gloo/install/helm/gloo/generate/values.go
Line 677 in 8f097f5
Probes *bool `json:"probes,omitempty" desc:"Set to true to enable a readiness probe (default is false). Then, you can also enable a liveness probe."` |
install/helm/gloo/generate/values.go
Outdated
// TODO(npolshak): Add support for GlooMtls | ||
} | ||
|
||
// GatewayProxyPodTemplate contains the Helm API available to configure the PodTemplate on the gateway Deployment | ||
type GatewayParamsPodTemplate struct { | ||
GracefulShutdown *GracefulShutdownSpec `json:"gracefulShutdown,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add descriptions to all fields
install/helm/gloo/generate/values.go
Outdated
GracefulShutdown *GracefulShutdownSpec `json:"gracefulShutdown,omitempty"` | ||
TerminationGracePeriodSeconds *int `json:"terminationGracePeriodSeconds,omitempty" desc:"Time in seconds to wait for the pod to terminate gracefully. See [kubernetes docs](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#istio-lifecycle) for more info."` | ||
Probes *bool `json:"probes,omitempty" desc:"Set to true to enable a readiness probe (default is false). Then, you can also enable a liveness probe."` | ||
CustomReadinessProbe *corev1.Probe `json:"customReadinessProbe,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the description here, we can say that if enableProbes=true and no customReadinessProbe is provided, then a default one will be created
} | ||
|
||
// Don't merge the command string as that can break the entire probe | ||
dst.Command = overrideSlices(dst.Command, src.Command) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to add a test to merge_test.go
that tests this override
projects/gateway2/helm/gloo-gateway/templates/gateway/proxy-deployment.yaml
Outdated
Show resolved
Hide resolved
command: | ||
- /bin/sh | ||
- -c | ||
- wget --post-data "" -O /dev/null 127.0.0.1:19000/healthcheck/fail; sleep {{ $gateway.gracefulShutdown.sleepTimeSeconds | default "10" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this work ok if sleepTimeSeconds is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes since a default is set
…ployment.yaml Co-authored-by: Jenny Shu <28537278+jenshu@users.noreply.github.com>
known flake #10365 |
Description
This update allows users to enable and add custom readiness and liveness probes to the Kubernetes Gloo Gateway.
This allows zero downtime when upgrading gloo or when rolling out new pods
API changes
The following helm changes have been made :
kubeGateway.gatewayParameters.glooGateway.podTemplate.terminationGracePeriodSeconds
to specify the terminationGracePeriodSeconds.kubeGateway.gatewayParameters.glooGateway.podTemplate.gracefulShutdown
to configure the graceful shutdown config for the envoy container.kubeGateway.gatewayParameters.glooGateway.podTemplate.customLivenessProbe
to specify a custom liveness probe for the envoy container.kubeGateway.gatewayParameters.glooGateway.podTemplate.livenessProbeEnabled
to enable the liveness probe. If the customLivenessProbe is not specified, no default liveness probe is set.kubeGateway.gatewayParameters.glooGateway.podTemplate.customReadinessProbe
to specify a custom readiness probe for the envoy container.kubeGateway.gatewayParameters.glooGateway.podTemplate.probes
to enable the readiness probe. If the customReadinessProbe is not specified, a default readiness probe is set. No default liveness probe is setThe following fields have been added to the GatewayParameters.Kube.PodTemplate field :
Code changes
Updated the deployer to read the new values from the gateway params and generate the appropriate helm values for the gloo gateway pod deployment
CI changes
Added the zero downtime config to all Kubernetes tests to ensure this does not break an existing feature
Docs changes
TODO in a followup with @solo-io/solo-docs
Context
https://github.com/solo-io/solo-projects/issues/7084
Interesting decisions
The helm values were created to mimic the existing values to setup the probes and zero downtime config for edge classic. Ref: https://github.com/solo-io/solo-projects/issues/7084
Testing steps
Added unit tests
Checklist: