Controller is not aware of the number of available IP addresses #18
Comments
Another option: kubelet has a flag called `--max-pods` that lets you configure how many pods can be scheduled on that node. Since the maximum number of IP addresses per node is well-known for each instance type, we can start kubelet with `--max-pods=[max_IP_addresses - number_of_IP_addresses_used_by_infra]`.
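As a rough illustration of that arithmetic (a sketch, not something this project ships: the ENI limits are the published EC2 numbers for `t2.medium`, and the infra-pod count is a made-up example):

```sh
# t2.medium supports 3 ENIs with 6 IPv4 addresses each; one address per
# ENI is the ENI's own primary address, leaving 3 * (6 - 1) = 15 pod IPs.
# Assuming (hypothetically) 2 of those are consumed by infrastructure:
kubelet --max-pods=13 ...
```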
The problem we had with max pods is that it also counts pods in the host network namespace, which don't need a pod IP. So a bunch of system DaemonSets were taking up already limited address space without needing an address. To work around that in our AWS CNI plugin we ended up adding a taint to the node to essentially mark it as full, and deleting pods that are failing. My takeaway has always been "there are options and none of them are great". Getting kubernetes/kubernetes#20177 upstream would probably be the best outcome.
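Marking a node as full with a taint might look like the following (a sketch; the node name and taint key are hypothetical, not necessarily what the AWS CNI plugin actually uses):

```sh
# Hypothetical taint key and node name, for illustration only; new pods
# without a matching toleration will no longer be scheduled to this node.
kubectl taint nodes ip-10-0-1-23.ec2.internal example.com/ip-exhausted=true:NoSchedule
```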
Will investigate whether we can use the scheduler extender mechanism (kubernetes/kubernetes#13580) for this.
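For context, a scheduler extender is wired up through the kube-scheduler's (since-deprecated) Policy config; a minimal sketch, assuming a hypothetical extender service on 127.0.0.1:8888 that filters out nodes with no free pod IPs:

```sh
# The extender endpoint below is hypothetical; nothing in this project
# provides it. Policy files and --policy-config-file are the legacy
# upstream kube-scheduler extension mechanism.
cat > /etc/kubernetes/scheduler-policy.json <<'EOF'
{
  "kind": "Policy",
  "apiVersion": "v1",
  "extenders": [
    { "urlPrefix": "http://127.0.0.1:8888", "filterVerb": "filter", "enableHttps": false }
  ]
}
EOF
kube-scheduler --policy-config-file=/etc/kubernetes/scheduler-policy.json
```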
One problem with the current …

@deiwin, right now, the …

Yes, but in our experience even 30 seconds is enough to cause many CronJob failures on a cluster with high IP address utilization. We have many CronJobs that run every minute and have a short …
Note that, unfortunately, the upstream issue (kubernetes/kubernetes#5507) that tracks this is in lifecycle/frozen and priority/backlog. I'm going to close this issue out with a plea for folks to comment on the upstream issue and see if we can find a path forward that makes IP addresses a concrete resource that is consumed by pods and tracked like other resources (CPU, memory, etc.).
Original issue description:

Controller is not aware of how many IP addresses are available to be assigned to pods. It tries to assign an IP address to the new pod and then fails reactively; it should have this information proactively.
Created a 2 node `t2.medium` cluster, created a Deployment from a configuration file, and scaled the replicas. 30 pods are expected in the cluster (2 nodes × 3 ENIs × 5 IPs per ENI) but only 27 pods are available; three pods are always stuck in the `ContainerCreating` state.

A similar cluster was created with `m4.2xlarge`. 120 pods (2 nodes × 4 ENIs × 15 IPs per ENI) are expected in the cluster, but only 109 pods are available.

More details about one of the pods that is not getting scheduled, and about the exact steps, are at https://gist.github.com/arun-gupta/87f2c9ff533008f149db6b53afa73bd0
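A minimal sketch of the reproduction (the Deployment name, image, and replica count are placeholders; the exact manifests and commands are in the gist above):

```sh
# Placeholder names/counts; see the linked gist for the real steps.
kubectl create deployment nginx --image=nginx
kubectl scale deployment nginx --replicas=30
kubectl get pods -o wide          # a few pods never leave ContainerCreating
kubectl describe pod <stuck-pod>  # inspect events for the IP allocation failure
```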