Pods stuck in ContainerCreating (AWS CNI pod limit) #219

Closed
3 tasks done
deliahu opened this issue Jul 8, 2019 · 0 comments · Fixed by #261 or #291
Labels: blocked (Blocked on another task or external event), bug (Something isn't working)

Comments

@deliahu
Member

deliahu commented Jul 8, 2019

Description

The per-node pod limit has been reached (17 for t3.medium, 11 for t3.small). The limit exists because the AWS CNI plugin assigns each pod a VPC IP address, and the number of addresses available on a node depends on its instance type. Once the limit is reached, cluster autoscaling is triggered, and pods that are scheduled on the new node get stuck in ContainerCreating.
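The limits quoted above follow from the commonly documented formula for the AWS VPC CNI: ENIs × (IPv4 addresses per ENI − 1) + 2. The sketch below reproduces both numbers. The ENI figures in the table are the published limits for these two instance types as I understand them and should be checked against the current AWS documentation; the function and table names are made up for illustration.

```python
# Sketch: max pods per node under the AWS VPC CNI.
# (ENIs, IPv4 addresses per ENI) per instance type; assumed values, verify
# against the AWS ENI limits table before relying on them.
ENI_LIMITS = {
    "t3.small": (3, 4),
    "t3.medium": (3, 6),
}


def max_pods(instance_type: str) -> int:
    """Return the pod limit implied by the instance type's ENI capacity."""
    enis, ips_per_eni = ENI_LIMITS[instance_type]
    # Each ENI keeps one address for itself, leaving (ips_per_eni - 1) for pods.
    # The +2 is usually explained as room for host-network pods
    # (such as aws-node and kube-proxy), which do not consume a pod IP.
    return enis * (ips_per_eni - 1) + 2


for itype in ENI_LIMITS:
    print(itype, max_pods(itype))
# t3.small 11
# t3.medium 17
```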

Update: this was fixed in v1.5.2 of the AWS CNI.

Things that will fix this:

  • Alternative CNI plugins

It may be necessary to run kubectl delete --namespace kube-system daemonset/aws-node before adding worker nodes, in order to uninstall the AWS CNI. It may also be necessary to start kubelet without --network-plugin=cni; otherwise kubelet may refuse to start because the configured CNI plugin cannot be brought up (the aws-node container is not running). Another way to remove the AWS CNI is to build a custom AMI in which the desired CNI plugin is prefixed with 00 instead of the standard 10, so that it is loaded ahead of the AWS VPC CNI plugin. (source)
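The 00 prefix works because of how CNI configuration files are selected. The sketch below is a simplified model: it assumes the runtime takes the lexicographically first file with a recognised extension from the CNI config directory. The file names are illustrative (10-aws.conflist is assumed to be the AWS VPC CNI's file, and 00-custom-cni.conflist is hypothetical), and pick_cni_config is not a real API.

```python
# Sketch: why a 00-prefixed CNI config wins over a 10-prefixed one.
# Assumption: the first config file in lexical order is the one loaded.
CNI_CONFIG_EXTENSIONS = (".conf", ".conflist", ".json")


def pick_cni_config(filenames):
    """Return the config file that sorts first, or None if there is none."""
    candidates = sorted(
        name for name in filenames if name.endswith(CNI_CONFIG_EXTENSIONS)
    )
    return candidates[0] if candidates else None


# Hypothetical directory listing on a node built from a custom AMI.
files = ["10-aws.conflist", "00-custom-cni.conflist", "README.txt"]
print(pick_cni_config(files))  # 00-custom-cni.conflist
```

With only 10-aws.conflist present, the same function returns that file, which matches the default behaviour described above.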

Things that will help:

  • Increase default node size
  • Increase default CPU request (to reach CPU limits before pod limits)
  • Replace Argo with custom DAG management
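For the CPU request item, a rough way to pick the default is to find the smallest per-pod request at which a node runs out of CPU no later than it runs out of pod slots. The sketch below is a back-of-the-envelope calculation, not what the project does: the allocatable CPU figure is a hypothetical placeholder (the real value depends on kubelet and system reservations), and it ignores the CPU and pod slots consumed by system pods.

```python
# Sketch: smallest per-pod CPU request (in millicores) such that CPU is
# exhausted before, or together with, the node's pod limit.


def min_cpu_request_millicores(allocatable_millicores: int, pod_limit: int) -> int:
    """Smallest request r with allocatable // r <= pod_limit."""
    # allocatable // r <= pod_limit  <=>  r > allocatable / (pod_limit + 1)
    return allocatable_millicores // (pod_limit + 1) + 1


def pods_that_fit_by_cpu(allocatable_millicores: int, request_millicores: int) -> int:
    """Number of pods with the given CPU request that fit on the node."""
    return allocatable_millicores // request_millicores


# Hypothetical numbers: 1900m allocatable CPU, pod limit of 17 (t3.medium).
allocatable = 1900
request = min_cpu_request_millicores(allocatable, 17)
print(request, pods_that_fit_by_cpu(allocatable, request))
```

Any default request at or above the computed value means the scheduler stops placing pods for lack of CPU before the CNI's IP-based limit is hit.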
@deliahu deliahu added the bug Something isn't working label Jul 8, 2019
@deliahu deliahu added the v0.7 label Jul 9, 2019
@deliahu deliahu mentioned this issue Jul 9, 2019
@deliahu deliahu self-assigned this Jul 12, 2019
@deliahu deliahu added the blocked Blocked on another task or external event label Jul 22, 2019
@deliahu deliahu changed the title Pods stuck in ContainerCreating Pods stuck in ContainerCreating (AWS CNI) Jul 25, 2019
@deliahu deliahu changed the title Pods stuck in ContainerCreating (AWS CNI) Pods stuck in ContainerCreating (AWS CNI pod limit) Jul 25, 2019
@deliahu deliahu removed the blocked Blocked on another task or external event label Jul 26, 2019
@deliahu deliahu reopened this Jul 27, 2019
@deliahu deliahu added the blocked Blocked on another task or external event label Jul 27, 2019
@deliahu deliahu removed the v0.7 label Jul 29, 2019