Add ability to override maxWaitForUnmountDuration for attach detach controller in controller manager #129805
Labels
kind/feature
Categorizes issue or PR as related to a new feature.
lifecycle/stale
Denotes an issue or PR has remained open with no activity and has become stale.
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/storage
Categorizes an issue or PR as relevant to SIG Storage.
What would you like to be added?
When a non-graceful node shutdown occurs, the kube-controller-manager taints the node with `node.kubernetes.io/unreachable`. When a pod's `tolerationSeconds` expires, the controller manager evicts the pod from the node. I set `tolerationSeconds: 60` for my pod and it gets evicted on time. But it cannot start, because I use RWO storage and the volume must first be detached from the failed node. So the controller manager tries to detach the volume and only logs the result after 6 minutes.

I know that the `NodeOutOfServiceVolumeDetach` feature gate has been GA since Kubernetes 1.28. With that logic, I have to taint the node with `node.kubernetes.io/out-of-service=nodeshutdown:NoExecute` to get the volume detached from the failed node and attached on the new node.

As far as I can see, `maxWaitForUnmountDuration` is hardcoded to 6 minutes in the code.
Why is this needed?
I think this is a common case.
My goal is fast migration of a pod in Kubernetes; I can't run this pod in multiple replicas.
But for that, I need to write a special controller that taints nodes on failure. This controller would just mark a node as out of service some time after it becomes Unreachable.