Mount issues with module aws_efs_csi_driver #1197

Closed
bla-ckbox opened this issue Nov 20, 2022 · 3 comments

@bla-ckbox

Description

When using the EFS add-on with an EFS filesystem provisioned by Terraform, the mount fails with an error; see #1171.
The fix in #1191 does not seem to be sufficient: I still get the same error in v4.17.0.

Versions

  • K8S version: 1.24
  • Module version: 4.17.0 (latest version currently)
  • Terraform version: 1.2.8
  • Provider version(s):
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.30.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.6.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.13.1
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.1.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.0
+ provider registry.terraform.io/hashicorp/tls v3.4.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
  • Module versions:
    • terraform-aws-modules/vpc/aws version 3.18.1
    • terraform-aws-modules/efs/aws version 1.0.1
  • Helm add-on versions:
    • helm-chart-aws-efs-csi-driver-2.3.2

Reproduction Code

#---------------------------------------------------------------
# EKS Blueprints
#---------------------------------------------------------------
module "eks_blueprints" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=v4.17.0"

  cluster_name    = local.cluster_name
  cluster_version = local.cluster_version

  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnets

  managed_node_groups = {
    mg_t3a = {
      node_group_name = "managed-ondemand"
      instance_types  = ["t3a.large"]
      min_size        = 3
      max_size        = 5
      subnet_ids      = module.vpc.private_subnets
    }
  }

  tags = local.tags
}

module "eks_blueprints_kubernetes_addons" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.17.0"

  eks_cluster_id       = module.eks_blueprints.eks_cluster_id
  eks_cluster_endpoint = module.eks_blueprints.eks_cluster_endpoint
  eks_oidc_provider    = module.eks_blueprints.oidc_provider
  eks_cluster_version  = module.eks_blueprints.eks_cluster_version
  eks_cluster_domain   = var.eks_cluster_domain    # For ExternalDNS Addons

  # EKS Managed Add-ons
  enable_amazon_eks_aws_ebs_csi_driver  = true
  enable_amazon_eks_coredns             = true
  enable_amazon_eks_kube_proxy          = true
  enable_amazon_eks_vpc_cni             = true

  # EKS Addons OVERWRITE
  amazon_eks_vpc_cni_config = {
    addon_version     = data.aws_eks_addon_version.latest["vpc-cni"].version
    resolve_conflicts = "OVERWRITE"
  }

  amazon_eks_coredns_config = {
    addon_version     = data.aws_eks_addon_version.latest["coredns"].version
    resolve_conflicts = "OVERWRITE"
  }

  amazon_eks_kube_proxy_config = {
    addon_version     = data.aws_eks_addon_version.latest["kube-proxy"].version
    resolve_conflicts = "OVERWRITE"
  }

  # K8s Add-ons
  enable_argocd                       = true
  argocd_manage_add_ons               = true

  argocd_applications     = {
    addons = {
      path                = "chart"
      repo_url            = "git@gitlab.com:XXXXXX/argocds-app-of-apps/eks-blueprints-add-ons.git"
      add_on_application  = true # Indicates the root add-on application.
      ssh_key_secret_name = aws_secretsmanager_secret.argocd_gitlab_ssh_key.name
    }
  }

  enable_aws_for_fluentbit    = true
  #enable_aws_load_balancer_controller = true

  enable_cert_manager         = true
  cert_manager_domain_names   = [var.eks_cluster_domain]

  enable_aws_efs_csi_driver   = true
  enable_external_dns         = true
  enable_metrics_server       = true # Deploys Metrics Server Addon
  enable_ingress_nginx        = true

  tags = local.tags

  depends_on = [ aws_secretsmanager_secret.argocd_gitlab_ssh_key, module.eks_blueprints ]
}

#---------------------------------------------------------------
# Supporting Resources
#---------------------------------------------------------------
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = local.name
  cidr = local.vpc_cidr

  azs             = local.azs
  public_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k)]
  private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 10)]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  # Manage so we can name
  manage_default_network_acl    = true
  default_network_acl_tags      = { Name = "${local.name}-default" }
  manage_default_route_table    = true
  default_route_table_tags      = { Name = "${local.name}-default" }
  manage_default_security_group = true
  default_security_group_tags   = { Name = "${local.name}-default" }

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = 1
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = 1
  }

  tags = local.tags
}


module "efs" {
  source  = "terraform-aws-modules/efs/aws"
  version = "~> 1.0"

  creation_token = local.name
  name           = local.name

  # Mount targets / security group
  mount_targets = { for k, v in toset(range(length(local.azs))) :
    element(local.azs, k) => { subnet_id = element(module.vpc.private_subnets, k) }
  }
  security_group_description = "${local.name} EFS security group"
  security_group_vpc_id      = module.vpc.vpc_id
  security_group_rules = {
    vpc = {
      # relying on the defaults provided for EFS/NFS (2049/TCP + ingress)
      description = "NFS ingress from VPC private subnets"
      cidr_blocks = module.vpc.private_subnets_cidr_blocks
    }
  }

  tags = local.tags

  depends_on = [
    module.eks_blueprints_kubernetes_addons
  ]
}

Steps to reproduce the behavior:

Create the following workload:

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
mountOptions:
  - iam
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: [FileSystemId] 
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
  volumeName: efs-pv
---
apiVersion: v1
kind: Pod
metadata:
  name: app1
spec:
  containers:
  - name: app1
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /data/out1.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: efs-claim
---
apiVersion: v1
kind: Pod
metadata:
  name: app2
spec:
  containers:
  - name: app2
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /data/out2.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: efs-claim

Expected behaviour

The pods should launch and mount the EFS volume.

Actual behaviour

Error message:

E1120 15:49:12.897427       1 mount_linux.go:184] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t efs -o tls fs-xxxxxxxxxx:/ /var/lib/kubelet/pods/b637f47d-14bf-4d2f-a632-e6ad2a2d2f8f/volumes/kubernetes.io~csi/efs-pv/mount
Output: Could not start amazon-efs-mount-watchdog, unrecognized init system "aws-efs-csi-dri"
b'mount.nfs4: access denied by server while mounting 127.0.0.1:/'
@bla-ckbox (Author)

I was finally able to determine that the sticking point is the default file system policy implemented by the terraform-aws-modules/efs/aws module:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "NonSecureTransport",
            "Effect": "Deny",
            "Principal": {
                "AWS": "*"
            },
            "Action": "*",
            "Resource": "arn:aws:elasticfilesystem:eu-west-3:xxxxxxxxxxxxxxx:file-system/fs-xxxxxxxxxxxxxx",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}

Accordingly, I tried enabling transport encryption (reference: https://github.com/kubernetes-sigs/aws-efs-csi-driver/tree/master/examples/kubernetes/encryption_in_transit):

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
mountOptions:
  - iam
  - tls
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: [FileSystemId] 
    volumeAttributes:
      encryptInTransit: "true"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
  volumeName: efs-pv

Without success: I always come back to my initial error, mount.nfs4: access denied by server while mounting 127.0.0.1:/ ...

In the meantime I found a workaround at EFS volume creation time: adding attach_policy = false in the "efs" module block (see the sketch below).
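
A minimal sketch of that workaround, assuming attach_policy is the terraform-aws-modules/efs/aws input that controls whether the default file system policy is attached; all other arguments stay as in the reproduction code above:

module "efs" {
  source  = "terraform-aws-modules/efs/aws"
  version = "~> 1.0"

  creation_token = local.name
  name           = local.name

  # Skip attaching the default file system policy (the "NonSecureTransport"
  # deny statement shown above) so the CSI driver mount is no longer rejected.
  attach_policy = false

  # mount_targets, security_group_* and tags stay exactly as in the
  # reproduction code above.
}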

That being said, I still consider that the right way would be to use EFS with TLS enabled. Can someone tell me whether something is missing in what I did, or whether it is really a bug?

@bla-ckbox (Author)

I am closing the ticket. My conclusion is that the policy injected by the terraform-aws-modules/efs/aws module is not compatible with a TLS mount established via the stunnel binary of aws-efs-csi-driver.

casualuser added a commit to casualuser/terraform-aws-eks-blueprints that referenced this issue Sep 11, 2023
@casualuser

Hello here from September 2023.

The stateful example in the repo is still broken and does not allow mounting EFS shared storage to either nodes or pods.

The solution proposed by @bla-ckbox, attach_policy = false in the "efs" module block, at least allows the example to launch successfully; finding this issue and trying the workaround takes a lot of investigation effort.

Shall we add attach_policy = false to the stateful example to make it work?
