sriov-network-operator #463: should list nodes based on label filters to decide whether to skip drain
Comments
@SchSeba Hey Sebastian, can you comment more on what we discussed? We are trying to determine the node count by filtering on node labels. I would like more clarification on whether this is a feasible change, as I think the current way of counting nodes is a bug. Thanks!
We don't want to do it automatically when you have more than one node, because the user needs to understand the implications of that. Meaning: when you have one node, you need to handle the pod resets after a configuration; when you have multiple nodes, the user must manually configure `skipDrain` and understand that the implication is that they will need to reset the workloads.
Doing some housekeeping, closing this one. Please feel free to reopen if needed.
Currently, `sriov-network-operator` determines whether a cluster is a single-worker-node cluster by checking the node count without any filtering (https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/pkg/utils/cluster.go#L78). If it is a single-worker-node cluster, the operator marks it as "skip drain" (https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/main.go#L249-L263), and the daemon then decides whether to skip the drain in https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/pkg/daemon/daemon.go#L519.
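For context, the linked check boils down to counting the nodes returned by an unfiltered List call. A rough client-go sketch of that pattern (illustrative only, not the operator's exact code; the function name here is made up):

```go
package cluster

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// isSingleNodeCluster sketches the current behaviour: it counts every node
// in the cluster with no label filtering, so a cluster with one worker node
// plus separate control-plane nodes is never treated as "single node".
func isSingleNodeCluster(ctx context.Context, c kubernetes.Interface) (bool, error) {
	nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return false, err
	}
	return len(nodes.Items) == 1, nil
}
```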
Now, we are using `sriov-network-operator` in a cluster where there is only one "worker node", on which we deploy the data-plane pods, including `sriov-device-plugin` and `sriov-network-config-daemon` (we pin those pods to the worker node by setting `configDaemonNodeSelector` in `SriovOperatorConfig`, see https://cloud.google.com/anthos/clusters/docs/bare-metal/latest/how-to/sriov#configure_the_sr-iov_operator), while the control plane is used for kube-apiserver. In this setup the worker node gets stuck in `SchedulingDisabled` after applying a `SriovNetworkNodePolicy`.
The reason is that we have a `PodDisruptionBudget` for some pods on the worker node, and those pods can only be scheduled on worker nodes because the control-plane nodes have taints which those pods don't tolerate. So when `sriov-network-config-daemon` tries to drain the node and evict those pods, they have no other node to go to; `sriov-network-config-daemon` logs the eviction error and just keeps retrying, so the node drain never succeeds. I think for this kind of cluster setup we should simply skip the node drain, since it can never succeed.
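To illustrate the mechanism: a drain evicts pods through the Eviction API, and the API server refuses an eviction that would violate a `PodDisruptionBudget`, so a drain loop built on that call retries indefinitely when the evicted pods have nowhere else to be rescheduled. A minimal client-go sketch of such an eviction call (illustrative only, not the operator's drain code):

```go
package drain

import (
	"context"
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// tryEvict issues a single eviction request for one pod. If the pod is
// covered by a PodDisruptionBudget that allows no further disruptions,
// the API server rejects the request with an error, and a drain loop that
// retries this call will never make progress.
func tryEvict(ctx context.Context, c kubernetes.Interface, namespace, pod string) error {
	eviction := &policyv1.Eviction{
		ObjectMeta: metav1.ObjectMeta{Name: pod, Namespace: namespace},
	}
	if err := c.PolicyV1().Evictions(namespace).Evict(ctx, eviction); err != nil {
		return fmt.Errorf("evicting %s/%s: %w", namespace, pod, err)
	}
	return nil
}
```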
Given that `configDaemonNodeSelector` in `SriovOperatorConfig` already decides which nodes the sriov config daemon is deployed to (and therefore which nodes may be drained), I think it makes more sense to decide whether there is only one such "node" in the cluster by listing nodes filtered by the labels in `configDaemonNodeSelector`. I am thinking of the following 2 solutions:

1. In `pkg/utils/cluster.go` we modify the node listing to filter by labels, taking the selector from the `sriovnetworkv1.SriovOperatorConfig` named `default` in the namespace `os.Getenv("NAMESPACE")`. Then `controllers/sriovoperatorconfig_controller.go` sets `DisableDrain` based on whether there is only one matching node (see the sketch after this list).
2. We do not modify the `SriovOperatorConfig` CR spec by setting `DisableDrain` automatically. Instead, when `DisableDrain` is set in the operator config we just follow it; when `DisableDrain` is not set, we still decide whether it is a single-node cluster based on node label filtering, but we then need to expose this "is it a single-node cluster" determination somewhere through the CR, I suppose. I would like to hear people's ideas on where this information should live to show the result of the single-node-cluster determination.
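To make solution 1 concrete, here is a minimal sketch of what a label-filtered single-node check could look like, assuming the selector map is read from the `configDaemonNodeSelector` of the `default` `SriovOperatorConfig`; the function name and parameters are illustrative, not the operator's actual API:

```go
package cluster

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/kubernetes"
)

// isSingleNodeForConfigDaemon counts only the nodes matching the
// configDaemonNodeSelector of the default SriovOperatorConfig, i.e. the
// nodes the config daemon (and therefore any drain) can actually target.
// nodeSelector is the map taken from the SriovOperatorConfig spec.
func isSingleNodeForConfigDaemon(ctx context.Context, c kubernetes.Interface, nodeSelector map[string]string) (bool, error) {
	selector := labels.SelectorFromSet(labels.Set(nodeSelector)).String()
	nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		return false, err
	}
	return len(nodes.Items) == 1, nil
}
```

The controller could then set `DisableDrain` (solution 1) or surface the result in the CR status (solution 2) based on this check, which keeps the existing single-node behaviour but scopes it to the nodes the config daemon actually manages.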