Error: Waiting for rollout to finish: 0 of 1 updated replicas are available... #6

Closed
krismorte opened this issue Apr 8, 2020 · 10 comments

Comments

@krismorte

First question: can I run this on EKS + Fargate? 😄

I'm facing this error:

Error: Waiting for rollout to finish: 0 of 1 updated replicas are available...

  on .terraform/modules/cluster.alb_ingress_controller/terraform-kubernetes-alb-ingress-controller-3.1.0/main.tf line 309, in resource "kubernetes_deployment" "this":
 309: resource "kubernetes_deployment" "this" {

The weird behavior is that after I got the error, if I ran a plan, this object was marked to be replaced:

-/+ resource "kubernetes_deployment" "this" {
      ~ id = "crawler/aws-alb-ingress-controller" -> (known after apply)

      ~ metadata {
            annotations      = {
                "field.cattle.io/description" = "AWS ALB Ingress Controller"
            }
          ~ generation       = 1 -> (known after apply)
            labels           = {
                "app.kubernetes.io/managed-by" = "terraform"
                "app.kubernetes.io/name"       = "aws-alb-ingress-controller"
                "app.kubernetes.io/version"    = "v1.1.6"
            }
            name             = "aws-alb-ingress-controller"
            namespace        = "crawler"
          ~ resource_version = "376158" -> (known after apply)
          ~ self_link        = "/apis/apps/v1/namespaces/crawler/deployments/aws-alb-ingress-controller" -> (known after apply)
          ~ uid              = "3e654aec-57f3-4832-8629-039adfb44482" -> (known after apply)
        }

      ~ spec {
            min_ready_seconds         = 0
            paused                    = false
            progress_deadline_seconds = 600
            replicas                  = 1
            revision_history_limit    = 10

            selector {
                match_labels = {
                    "app.kubernetes.io/name" = "aws-alb-ingress-controller"
                }
            }

          ~ strategy {
              ~ type = "RollingUpdate" -> (known after apply)

              ~ rolling_update {
                  ~ max_surge       = "25%" -> (known after apply)
                  ~ max_unavailable = "25%" -> (known after apply)
                }
            }

          ~ template {
              ~ metadata {
                    annotations      = {
                        "iam.amazonaws.com/role" = "arn:aws:iam::333374442078:role/k8s-rooftop-cluster-stage-alb-ingress-controller"
                    }
                  ~ generation       = 0 -> (known after apply)
                    labels           = {
                        "app.kubernetes.io/name"    = "aws-alb-ingress-controller"
                        "app.kubernetes.io/version" = "1.1.6"
                    }
                  + name             = (known after apply)
                  + resource_version = (known after apply)
                  + self_link        = (known after apply)
                  + uid              = (known after apply)
                }

              ~ spec {
                  - active_deadline_seconds          = 0 -> null
                  - automount_service_account_token  = false -> null
                    dns_policy                       = "ClusterFirst"
                    host_ipc                         = false
                    host_network                     = false
                    host_pid                         = false
                  + hostname                         = (known after apply)
                  + node_name                        = (known after apply)
                  - node_selector                    = {} -> null
                    restart_policy                   = "Always"
                    service_account_name             = "aws-alb-ingress-controller"
                    share_process_namespace          = false
                    termination_grace_period_seconds = 60

                  ~ container {
                        args                     = [
                            "--ingress-class=alb",
                            "--cluster-name=rooftop-cluster-stage",
                            "--aws-vpc-id=vpc-03a0cc5578bdbfb97",
                            "--aws-region=us-east-2",
                            "--aws-max-retries=10",
                        ]
                      - command                  = [] -> null
                        image                    = "docker.io/amazon/aws-alb-ingress-controller:v1.1.6"
                        image_pull_policy        = "Always"
                        name                     = "server"
                        stdin                    = false
                        stdin_once               = false
                        termination_message_path = "/dev/termination-log"
                        tty                      = false

                      ~ liveness_probe {
                            failure_threshold     = 3
                            initial_delay_seconds = 60
                            period_seconds        = 60
                            success_threshold     = 1
                            timeout_seconds       = 1

                          ~ http_get {
                                path   = "/healthz"
                                port   = "health"
                                scheme = "HTTP"
                            }
                        }

                      ~ port {
                            container_port = 10254
                          - host_port      = 0 -> null
                            name           = "health"
                            protocol       = "TCP"
                        }

                      ~ readiness_probe {
                            failure_threshold     = 3
                            initial_delay_seconds = 30
                            period_seconds        = 60
                            success_threshold     = 1
                            timeout_seconds       = 3

                          ~ http_get {
                                path   = "/healthz"
                                port   = "health"
                                scheme = "HTTP"
                            }
                        }

                      ~ resources {
                          + limits {
                              + cpu    = (known after apply)
                              + memory = (known after apply)
                            }

                          + requests {
                              + cpu    = (known after apply)
                              + memory = (known after apply)
                            }
                        }

                      ~ volume_mount {
                            mount_path        = "/var/run/secrets/kubernetes.io/serviceaccount"
                            mount_propagation = "None"
                            name              = "aws-alb-ingress-controller-token-qsfhm"
                            read_only         = true
                        }
                    }

                  + image_pull_secrets {
                      + name = (known after apply)
                    }

                  ~ volume {
                        name = "aws-alb-ingress-controller-token-qsfhm"

                      ~ secret {
                            default_mode = "0644"
                          - optional     = false -> null
                            secret_name  = "aws-alb-ingress-controller-token-qsfhm"
                        }
                    }
                }
            }
        }
    }
@headcr4sh
Collaborator

headcr4sh commented Apr 8, 2020

Seems weird to me.
Which version of

  • Terraform
  • terraform-provider-kubernetes

are you using?

@krismorte
Author

Yeah, it's weird. Here is my config:

Terraform v0.12.24
+ provider.aws v2.55.0
+ provider.kubernetes v1.11.1
+ provider.local v1.4.0
+ provider.tls v2.1.1

@headcr4sh
Collaborator

Ok.
Those are the most recent versions, which we are using as well -- without any problems.
I therefore assume it's not an issue with either terraform or the kubernetes provider plugin.

Would you mind posting the configuration options that you are using for the module?
And one more thing: which version of Kubernetes are you using? EKS or non-EKS?

@krismorte
Author

My setup is below; EKS is on the latest version.

resource "aws_eks_cluster" "eks_cluster" {
  name     = local.eks_cluster_name
  role_arn = aws_iam_role.eks_iam.arn

  vpc_config {
    subnet_ids = data.aws_subnet_ids.subnets.ids
  }

  enabled_cluster_log_types = ["api", "audit","controllerManager","scheduler"]

  # Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
  # Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
  depends_on = [
    aws_iam_role_policy_attachment.eks_iam_AmazonEKSClusterPolicy,
    aws_iam_role_policy_attachment.eks_iam_AmazonEKSServicePolicy,
    aws_cloudwatch_log_group.eks_cluster_cloudwatch,
  ]
}
resource "aws_eks_fargate_profile" "eks_fargate_prof" {
  cluster_name           = local.eks_cluster_name
  fargate_profile_name   = "fargate-profile-crwaler"
  pod_execution_role_arn = aws_iam_role.eks_fargate_iam.arn
  subnet_ids             = data.aws_subnet_ids.private_subnets.ids

  selector {
    namespace = "crawler"
  }
}

@headcr4sh
Collaborator

Mh. Seems alright. How did you configure the ALB module?

@headcr4sh
Collaborator

Oh. I didn't read your first sentence...

First question is can I run this on a EKS+ fargate

I'm not sure if it makes sense to run the ALB Ingress Controller on Fargate. It should be possible, but I think that a proper nodeSelector/nodeAffinity and the right toleration must be configured then.

It should be possible, but I don't think it makes sense to run any of the basic infrastructure controllers on a serverless platform such as Fargate. Therefore I suggest you create a dedicated non-Fargate node group to host infrastructure components (such as metrics-server, kube-state-metrics, external-dns, cert-manager, alb-ingress-controller, jaeger, prometheus and the like).
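
Roughly like this, for example (an untested sketch; the node role, sizes and labels below are placeholders, not taken from your config):

resource "aws_eks_node_group" "infra" {
  cluster_name    = local.eks_cluster_name
  node_group_name = "infra"
  node_role_arn   = aws_iam_role.eks_node_iam.arn          # placeholder: a node role with the usual EKS worker policies attached
  subnet_ids      = data.aws_subnet_ids.private_subnets.ids

  scaling_config {
    desired_size = 1
    min_size     = 1
    max_size     = 2
  }

  # Label the nodes so infrastructure workloads can target them via nodeSelector/nodeAffinity.
  labels = {
    "node-role" = "infra"
  }
}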

@krismorte
Author

Yeah, I kind of had that suspicion. I'm following the tutorial below, but converting everything to Terraform:

https://eksworkshop.com/beginner/180_fargate/prerequisites-for-alb/

I will try to add a node group just for the infra and give it a shot. If everything works, I'll close this issue with the new config.

@kadaffy

kadaffy commented Apr 9, 2020

Hello, I'm having the same issue, but I'm not using Fargate:

module.alb_ingress_controller.kubernetes_deployment.this: Still creating... [9m40s elapsed]
module.alb_ingress_controller.kubernetes_deployment.this: Still creating... [9m50s elapsed]
module.alb_ingress_controller.kubernetes_deployment.this: Still creating... [10m0s elapsed]

Error: Waiting for rollout to finish: 0 of 1 updated replicas are available...

  on .terraform/modules/alb_ingress_controller/main.tf line 309, in resource "kubernetes_deployment" "this":
 309: resource "kubernetes_deployment" "this" {

Kubernetes Version: 1.15
Kubernetes Provider: ~> 1.6
Terraform v0.12.17

I'm using this module to create the EKS cluster: https://github.com/terraform-aws-modules/terraform-aws-eks

@deniojunior

deniojunior commented Jul 1, 2020

I had the same issue; my logs say:

caused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.us-east-1.amazonaws.com/id/9B187AEBC38B819CC086530DB01E5A86
	status code: 400, request id: b08f60c0-5e20-4736-8ad0-cfc921bdb00a

According to the AWS documentation, you must create an IAM OIDC provider and associate it with your cluster:

Create an IAM OIDC provider and associate it with your cluster. If you don't have eksctl version 0.22.0 or later installed, complete the instructions in Installing or upgrading eksctl to install or upgrade it. You can check your installed version with eksctl version.

Maybe if you add an aws_iam_openid_connect_provider resource to the module, this issue could be solved.
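
Something along these lines might do it (an untested sketch against the cluster resource from the config above; the tls_certificate data source needs a reasonably recent tls provider, otherwise the thumbprint has to be supplied by hand):

data "tls_certificate" "eks_oidc" {
  url = aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "eks_oidc" {
  # Registers the cluster's OIDC issuer with IAM so service-account role assumption works.
  url             = aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks_oidc.certificates[0].sha1_fingerprint]
}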

@deniojunior

I just found out that if you are using the terraform-aws-eks module to create the EKS cluster, all you have to do is enable creation of the OpenID Connect provider by adding:

enable_irsa  = true

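For context, that lands inside the module block roughly like this (the other arguments here are placeholders, not anyone's actual configuration):

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = "my-cluster"     # placeholder
  cluster_version = "1.16"           # placeholder
  subnets         = var.subnet_ids   # placeholder
  vpc_id          = var.vpc_id       # placeholder

  # Creates the IAM OIDC provider for the cluster, which IAM roles for service accounts (IRSA) rely on.
  enable_irsa = true
}
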