Workloads are not scheduled to one of two worker nodes - race condition at setup? #3531
Comments
Hi @DonMartin76. It looks like node labels aren't applied correctly, causing the scheduling failure you're seeing.
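(For anyone checking the same thing, a quick way to compare labels across the two workers looks roughly like this; a generic sketch, not taken from the thread:)

```bash
# Dump every node with its full label set
kubectl get nodes --show-labels

# Or show just the labels most relevant to scheduling / pool membership
# ("agentpool" is the pool label aks-engine normally applies; an assumption here)
kubectl get nodes -L kubernetes.io/role -L agentpool
```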
I do not have the cluster around anymore, but I do have the node descriptions.
node 0:
node 1:
I wasn't able to repro it either. @aramase, will the lack of Secret Store CSI annotations affect pod scheduling?
No, the annotation doesn't affect pod scheduling. In fact, that annotation is only added once the CSI driver pod is running on the node.
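(To check whether a node carries a given annotation at all, dumping the annotations per node works; a minimal sketch — node names come from `kubectl get nodes`, and which specific annotation matters here is not shown in the thread:)

```bash
# Dump every annotation on each node so the two workers can be diffed
for n in $(kubectl get nodes -o name); do
  echo "== $n"
  kubectl get "$n" -o jsonpath='{.metadata.annotations}'
  echo
done
```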
I am a colleague of @DonMartin76. One thing I noticed is that the faulty cluster has the following line in its logs:
while the working cluster looks like this:
Maybe the order is important. Note that in the upper logs the westeurope-0 / westeurope-1 entries appear in a different order than in the logs of the working cluster.
@Jabb0 I don't think the order matters. The westeurope-0 and westeurope-1 are referring to the fault domain of the VM availability set, which shouldn't affect pod scheduling. Based on the kube-scheduler log, the scheduling error is either insufficient resources or a host port conflict on the node.
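(For reference, the fault domain mentioned above is visible on the node object through its topology labels; a sketch using the beta label keys that a 1.18 Azure cluster still populates — an assumption, not confirmed in the thread:)

```bash
# Show which fault domain / region the cloud provider recorded for each node
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone -L failure-domain.beta.kubernetes.io/region
```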
@chewong
Here the ports are not an issue. However, it only explains 3 out of 4 nodes. Is that intentional?
Is it necessary for any pods (e.g. the CSI ones) to run on a node before other pods can be scheduled to it?
@Jabb0 I think it's intentional: the 3 nodes are your helper & worker nodes, and the remaining node should be the master, which some pods can't be scheduled to. Those pods are just part of an add-on that aks-engine deploys after the cluster is provisioned, and they should not affect pod scheduling at all. I suspect the kubelet or even the underlying VMs are experiencing some problems. If you encounter this problem again, could you share the kubelet log of the problematic node? See https://github.com/Azure/aks-engine/blob/master/docs/howto/troubleshooting.md for how to collect kubelet logs.
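(On an aks-engine node the kubelet runs as a systemd unit, so collecting its log usually comes down to something like this; a sketch, assuming SSH access to the node and the conventional unit name:)

```bash
# On the problematic worker node, dump the kubelet journal since the last boot
sudo journalctl -u kubelet --no-pager -b > kubelet.log

# Also worth a look from any machine with cluster access:
# the node's conditions and recent events (node name is a placeholder)
kubectl describe node <problematic-node> | sed -n '/Conditions:/,/Events:/p'
```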
Should we leave this issue open? Are we still repro'ing?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We are seeing this issue with Kubernetes 1.18.6 on an intermittent basis during an install (k8s plus services running on the cluster). Our workaround has been to reboot the master node on which the lead kube-scheduler is running; in one instance we had to do that twice to get through our initial install.

We're running with three physical nodes, and each physical node has 1 master VM and 1 worker VM, using a stacked master/etcd. We bring up the first physical node and get k8s running, along with OpenEBS to provide persistent storage. While the second physical node is coming up, and its master/worker VMs are joining the cluster, the original kube-scheduler container restarts. The new kube-scheduler leader is the one that stops scheduling on one of the worker nodes.

I've added "-v=4" to the kube-scheduler to get more logs. For the node that isn't getting scheduled, I see only these two references to it in the logs:
So, the kube-scheduler was aware that this node exists. Another interesting set of log messages is:
Note that
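(For anyone turning up scheduler verbosity the same way, this usually means editing the scheduler's static pod manifest on the master; a sketch, assuming the control plane runs as static pods under /etc/kubernetes/manifests and a docker runtime, neither of which is confirmed for this particular setup:)

```bash
# On the master currently holding the kube-scheduler lease, add "--v=4" to the
# scheduler's command args in its static pod manifest
sudo vi /etc/kubernetes/manifests/kube-scheduler.yaml

# kubelet watches the manifests directory and recreates the container on its own;
# confirm the running process picked up the flag
ps aux | grep [k]ube-scheduler
```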
Describe the bug
I saw the following behaviour today which I have never seen before: In a pretty simple 2+1+1 node cluster (2 workers, 1 helper for utilities, 1 master), Kubernetes would not schedule any normal pods to the second worker node; this resulted in the first worker node being swamped and soon rejecting further pods due to CPU limitations.
Please note that we have been using this setup in production for a long time, at least 18 months, and I have never seen this before. We quite recently updated to the latest aks-engine version, and we are running Kubernetes 1.18.3 with it.
What I noticed when comparing the output of `kubectl describe node` for both nodes is that one of the nodes (the one where nothing gets scheduled) is missing the following annotation:

On a functioning cluster, both worker nodes have this annotation. Other than this, I cannot find anything which looks odd in the node description.
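(For anyone comparing the two nodes the same way, a minimal sketch; node names are placeholders:)

```bash
# Capture and diff the two node descriptions side by side
kubectl describe node <worker-0> > worker-0.txt
kubectl describe node <worker-1> > worker-1.txt
diff worker-0.txt worker-1.txt
```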
Here's also an excerpt from the logs of the `kube-scheduler` (directly from the docker container on the master node):

This makes it look like the missing annotation is rather an effect of the underlying problem, and not the problem as such.
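(Grabbing such an excerpt typically looks like this on the master node; a sketch, and the container name filter is an assumption:)

```bash
# Locate the scheduler container and dump its recent output
docker ps | grep kube-scheduler
docker logs --tail 200 <scheduler-container-id>
```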
Steps To Reproduce
The tricky part is that I cannot reproduce this anymore. The next cluster which was created worked nicely again - and it had the correct annotations. I do not know whether those annotations are related, but it's the only thing I could find which differed between the nodes.
This is the API model we used:
Expected behavior
All nodes in the "worker" node pool should get workloads scheduled.
AKS Engine version
0.52.0
Kubernetes version
1.18.3
Additional context
To me, this looks like some kind of race condition at provisioning; it does not seem to happen often, but this time it did, and it makes the cluster behave weirdly.
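(Until the root cause is found, a simple post-provisioning sanity check can at least catch the bad state before workloads pile onto one node; a sketch — the role label is the one aks-engine normally applies to agent nodes, which is an assumption here:)

```bash
# Verify every agent node is schedulable and free of unexpected taints
kubectl get nodes -l kubernetes.io/role=agent \
  -o custom-columns=NAME:.metadata.name,UNSCHEDULABLE:.spec.unschedulable,TAINTS:.spec.taints

# And check that pods are actually landing on both workers
kubectl get pods --all-namespaces -o wide | tail -n +2 | awk '{print $8}' | sort | uniq -c
```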