Summary
It would be helpful if the status indicators for pods during a rollout (e.g. these) were more fine-grained. Specifically, they should take into account the readiness state of the app itself.
We had an issue where a rollout started and new pods came up. They were scheduled and live as far as k8s was concerned, but they were not yet ready according to the readiness configuration. Even so, the UI showed the pods as green, and operators incorrectly assumed that everything was fine. This status, combined with the fact that Argo Rollouts frequently gets stuck, led the human operators to draw the wrong conclusion (they assumed Argo Rollouts was stuck), and they did a full promote. For reasons that are not important here, that caused an outage.
Perhaps the UI could take into account all the possible states for a pod (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) and expose them to users.
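As a rough illustration of the distinction being asked for, here is a minimal Python sketch that maps a Pod object (in the shape returned by `kubectl get pod -o json`) to a display status. The function name `display_status` and the status labels are hypothetical and are not part of Argo Rollouts. The only fields it relies on are the standard `status.phase` and the `Ready` entry in `status.conditions`.

```python
def display_status(pod: dict) -> str:
    """Return a UI status label for a Pod dict (hypothetical labels)."""
    status = pod.get("status", {})
    phase = status.get("phase", "Unknown")

    # Anything other than Running is reported by its phase:
    # pending, succeeded, failed, unknown.
    if phase != "Running":
        return phase.lower()

    # Index the pod conditions by type, e.g. {"Ready": "False", ...}.
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}

    # A Running pod is only "green" once its Ready condition is True,
    # which reflects the readiness probe results of its containers.
    if conditions.get("Ready") == "True":
        return "ready"
    return "running-not-ready"


# Example: a pod that is running but has not passed its readiness probe yet.
pod = {
    "status": {
        "phase": "Running",
        "conditions": [
            {"type": "PodScheduled", "status": "True"},
            {"type": "Ready", "status": "False"},
        ],
    }
}
print(display_status(pod))
```

With a mapping like this, the pod in the incident above would have been shown in an intermediate state rather than as green.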
Use Cases
All the time.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.