trouble using --aws-role-arn option when adding EKS cluster with argocd CLI #2347

Closed
jeremyhermann opened this issue Sep 23, 2019 · 62 comments
Labels
bug (Something isn't working) · component:multi-cluster (Features related to clusters management) · workaround (There's a workaround, might not be great, but exists)

Comments

@jeremyhermann

I am trying to use the --aws-role-arn option when adding an EKS cluster to ArgoCD, as described in #1304. I have not been able to get it to work, the error messages are difficult to interpret, and I am not sure how to debug this.

  • I have ArgoCD running in one AWS account and my EKS cluster is in another AWS account

  • I have set up the acme-production-deploy-role so that it can be assumed both by the AWS role that I am using to run argocd cluster add ... and by the EC2 instances in my ArgoCD cluster (I am confused about which IAM identity is used to assume the role, so I allowed both).

  • Here is what I see when I try to add the cluster. (I have redacted the AWS account numbers and the EKS id, but confirmed that I used the correct values for these):

$ argocd cluster add acme-production --aws-cluster-name arn:aws:eks:us-west-2:<account-number>:cluster/acme-production --aws-role-arn arn:aws:iam::<account-number>:role/acme-production-deploy-role

FATA[0000] rpc error: code = Unknown desc = REST config invalid: Get https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com/version?timeout=32s: getting credentials: exec: exit status 1 

Note that I am able to successfully add the cluster using argocd cluster add https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com
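For reference, the credential step can be exercised by hand from the environment where the argocd CLI and API server run; this is only a debugging sketch, assuming aws-iam-authenticator is installed and using the same redacted names as above:

# check that the deploy role can actually be assumed from this environment
aws sts assume-role \
  --role-arn arn:aws:iam::<account-number>:role/acme-production-deploy-role \
  --role-session-name argocd-debug

# ask aws-iam-authenticator for a token the same way the generated kubeconfig would
aws-iam-authenticator token \
  -i acme-production \
  -r arn:aws:iam::<account-number>:role/acme-production-deploy-role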

Thanks

@jeremyhermann jeremyhermann added the bug label Sep 23, 2019
@alexec alexec added the workaround label Oct 2, 2019
@sriddell

sriddell commented Oct 3, 2019

I'm not sure of the minimal permissions necessary, but I was able to get argocd to add an external cluster (cross account) by doing the following:

In account B (external cluster), created a role argocd-test with AdministratorAccess policy attached, and created a trust relationship between it and account A (running argocd in an eks cluster).

In the external cluster, I ran kubectl edit -n kube-system configmap/aws-auth to edit the IAM-role-to-RBAC mappings, and under mapRoles added:

    - rolearn: arn:aws:iam::accountB_number:role/argocd-test
      username: arn:aws:iam::accountB_number:role/argocd-test
      groups:
        - system:bootstrappers
        - system:nodes

Note that the username has to be the same as the rolearn; if not, you will get a 'the server has asked for the client to provide credentials' error when trying to add the external cluster to argocd.

In account A, which runs argocd, I attached a policy allowing assumption of the argocd-test role in account B to the IAM role on the EC2 nodes running argocd. (It may be possible to map the role to a service account used by the argocd pods instead; I didn't test whether argocd is using a new enough aws-sdk for that support yet.)

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Resource": "arn:aws:iam::accountB_number:role/argocd-test"
  }
}

In my kubeconfig, the external cluster in account B is registered as 'arn:aws:eks:us-east-1:accountB_number:cluster/kube_remote'.
With argocd logged in to the cluster in account A, the following command to add the external cluster in the other account was accepted:

argocd cluster add arn:aws:eks:us-east-1:accountB_number:cluster/kube_remote --aws-role-arn arn:aws:iam::accountB_number:role/argocd-test --aws-cluster-name kube_remote

Note that both --aws-role-arn and --aws-cluster-name are needed to make argocd use the aws-iam-authenticator to assume the role in account B. Also, --aws-cluster-name has to be just the name of the external cluster - not the full arn.

I am using Kube 1.14 in EKS, and argocd 1.2.2.

@jannfis jannfis added the component:multi-cluster label May 14, 2020
@carlosjgp

I have the same issue, but I'm using Terraform to create the secret just after the cluster has been created, and I can't get it working.

I followed this structure
https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/declarative-setup.md#clusters

resource "kubernetes_secret" "dev-green-argocd" {
  # this secret gets deployed on CI cluster
  # to allow ArgoCD to access the new cluster
  provider = kubernetes.ci

  metadata {
    labels = {
      "argocd.argoproj.io/secret-type" = module.cluster_green.eks_cluster.name
    }
    name      = module.cluster_green.eks_cluster.name
    namespace = "argocd"
  }

  data = {
    server = module.cluster_green.eks_cluster.endpoint
    name   = module.cluster_green.eks_cluster.name
    config = <<CONFIG
{
    "awsAuthConfig": {
        "clusterName": "${module.cluster_green.eks_cluster.name}",
        "roleARN": "${module.base.admin_role.arn}"
    },
    "tlsClientConfig": {
        "insecure": false,
        "caData": "${module.cluster_green.eks_cluster.certificate_authority.0.data}"
    }
}
CONFIG
  }
}

but I can't find any logs related to detecting this new secret, cluster name, or cluster endpoint on the repo-server, server, or application-controller.

I was also trying to work out how ArgoCD uses aws-iam-authenticator and which component sends the request to the other AWS clusters/accounts.
I've inferred from some tickets and conversations that the server component is the one in charge of sending the request, so this component should have some sort of AWS SDK setup somewhere...

  • aws/config files
  • environment variables
  • IAM Role attached to the PODs

But this is not documented and I'm struggling to automate the process.

I'd happily document the required IAM configuration (by creating a PR) if I could understand how all of this works.

@jurgenweber

when I try to add the cluster on the CLI, I am seeing the following in the argocd-server logs:

argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T01:43:54.914385589Z An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::1111:assumed-role/argocd-prod-workers-role/i-2222 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111:role/aws20200611003524576400000005

I need to know how to set up a role that can be assumed by a role!

@jurgenweber

jurgenweber commented Jun 12, 2020

Ok, so if you are doing this in the same account, here is what I did. I created a role to put in my target cluster's aws-auth ConfigMap, like so:

    - "groups":
      - "system:bootstrappers"
      - "system:nodes"
      "rolearn": "arn:aws:iam::my account number:role/ArgoCDTest"
      "username": "arn:aws:iam::my account number:role/ArgoCDTest"

This role trusts the account root, e.g.:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::my account number:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

The worker node role of the argocd cluster needs to have sts:AssumeRole permission for this new role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:*",
            "Resource": "arn:aws:iam::my account number:role/ArgoCDTest"
        }
    ]
}

This has now gotten me past the AWS error above, but now I have a new error:

FATA[0006] rpc error: code = Unknown desc = REST config invalid: the server has asked for the client to provide credentials

@jurgenweber

jurgenweber commented Jun 12, 2020

Ok, it looks like this error is a 'user issue', i.e. my fault. ;) I am trying to add the cluster like so:

argocd cluster add arn:aws:eks:ap-southeast-2:my account number:cluster/my-cluster-name --aws-role-arn "arn:aws:iam::my account number:role/ArgoCDTest" --aws-cluster-name kube_remote

The --aws-cluster-name was incorrect; it should be 'my-cluster-name', not 'kube_remote'. Once I discovered this:

$ argocd cluster add arn:aws:eks:ap-southeast-2:my account number:cluster/my-cluster-name --aws-cluster-name my-cluster-name --aws-role-arn "arn:aws:iam::my account number:role/ArgoCDTest"
Cluster 'https://long and anonymous string.yl4.ap-southeast-2.eks.amazonaws.com' added

and in the argocd logs:

argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T02:45:02.465689783Z time="2020-06-12T02:45:02Z" level=info msg="Alloc=10468 TotalAlloc=5311811 Sys=72272 NumGC=1253 Goroutines=153"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.49700286Z time="2020-06-12T02:45:29Z" level=info msg="Starting configmap/secret informers"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.497064166Z time="2020-06-12T02:45:29Z" level=info msg="configmap informer cancelled"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.497074411Z time="2020-06-12T02:45:29Z" level=info msg="secrets informer cancelled"
argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T02:45:29.498149611Z time="2020-06-12T02:45:29Z" level=info msg="Notifying 1 settings subscribers: [0xc0008449c0]"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.597015242Z time="2020-06-12T02:45:29Z" level=info msg="Configmap/secret informer synced"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.597607321Z time="2020-06-12T02:45:29Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Create grpc.service=cluster.ClusterService grpc.start_time="2020-06-12T02:45:26Z" grpc.time_ms=3573.327 span.kind=server system=grpc

but!

in the UI it is still failed:

but I can see no other indication of why it failed.

@jurgenweber

Ok, I have it working now... I updated my 1.6 install to 'latest'; it works in master/latest.

@jurgenweber

Another thing of note: I tried not using the IAM role of the nodes, but instead using the OIDC features in EKS:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::1111:oidc-provider/oidc.eks.ap-south-1.amazonaws.com/id/1111"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.ap-south-1.amazonaws.com/id/1111:sub": "system:serviceaccount:argocd:argocd-server"
        }
      }
    }
  ]
}

I still received the same error:

An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::1111:assumed-role/operational-1111/i-090dcd405f3f70755 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111:role/argocd-manager-k8s00-dev-aws

Maybe this can be supported?

@helenabutowatcisco

@jurgenweber any luck in getting this to work with a ServiceAccount?

@dshackith

I am struggling with the same thing. EKS with ArgoCD in one cluster, wanting to add an external EKS cluster. I have set up a role in IAM with an attached trust policy that grants trust via OIDC to the two service accounts used by ArgoCD (server and app-controller). These service accounts have been annotated with the IAM role. In the external cluster the role has been added to the aws-auth ConfigMap. The pods are already using the role that has rights in the external cluster, so there is no need to assume a role. I am using declarative secrets for adding the external clusters, and not specifying the ARN for the role (as it does not need to assume one). The clusters show up in the UI, but as failed. No messages in the log on the argocd-server pod.
The IAM Roles for Service Accounts setup I am using seems to be successful, in the sense that the pod has env variables for AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, and the token file is readable and has a token (I did have to set fsGroup to 999 globally).
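For reference, a quick way to sanity-check the IRSA wiring is from inside the pod; this is only a sketch and assumes a shell, kubectl, and the aws CLI are available in the image:

kubectl -n argocd exec -it deploy/argocd-server -- bash
# inside the pod:
env | grep -E 'AWS_ROLE_ARN|AWS_WEB_IDENTITY_TOKEN_FILE'
ls -l "$AWS_WEB_IDENTITY_TOKEN_FILE"   # must be readable by the argocd user (hence fsGroup 999)
aws sts get-caller-identity            # should report the assumed IRSA role
aws eks get-token --cluster-name <external-cluster-name>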

@dshackith

From the argocd-server pod I am able to use aws eks get-token (the command used by ArgoCD) to retrieve a token from the external cluster.

@helenabutowatcisco

Here's a rough recipe for what I've managed to get working with IAM roles scoped to a Kubernetes ServiceAccount and a single AWS account:

  1. use the latest tag for the images. I couldn't get this to work with v1.6.
  2. you'll need to set the fsGroup for the securityContext to 999.
  3. setup the IAM OpenID Connect provider for the EKS cluster running Argo CD by following https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
  4. create one IAM policy and one role following https://docs.aws.amazon.com/eks/latest/userguide/create-service-account-iam-policy-and-role.html
    • there needs to be 2 trust relationships for the one role: one for the argocd-server ServiceAccount and one for the argocd-application-controller ServiceAccount
    • the policy should allow AssumeRole and AssumeRoleWithWebIdentity for the STS service with the resources for the policy limited to the ARN for the IAM role
  5. you'll need to add an annotation to the argocd-server and argocd-application-controller ServiceAccounts following https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html
  6. for the EKS clusters that Argo CD will be deploying apps into, add a new entry into the aws-auth ConfigMap by following https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
    • the rolearn and username values should match (though I'm not certain that this is required)
    • for groups, you could use system:masters but you might want to restrict scope further
  7. for the Secret for each cluster, follow https://argoproj.github.io/argo-cd/operator-manual/declarative-setup/#clusters
    • be sure to add the label to the Secret as shown in the docs
    • for the Secret data:
      • name should be the ARN of the EKS cluster
      • server should be the URL of the Kubernetes API for the EKS cluster.
      • in the config block only set the following:
        • awsAuthConfig where:
          • clusterName is the name of the EKS cluster as returned by aws eks list-clusters
          • roleARN is the ARN of the IAM role
        • tlsClientConfig where:
          • insecure is false. This might not be required but it doesn't hurt to be explicit.
          • caData is the certificate returned by aws eks describe-cluster --query "cluster.certificateAuthority" --output text. It should already be base64 encoded.
  8. when creating an app with argocd app create, set --dest-server to the URL of the Kubernetes API for the cluster (see the sketch after this list).
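For concreteness, a minimal sketch of what steps 7 and 8 can look like; every name, endpoint, and ARN below is a placeholder, so adjust them to your own clusters:

apiVersion: v1
kind: Secret
metadata:
  name: my-eks-cluster                        # placeholder
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # the label from the docs
type: Opaque
stringData:
  name: arn:aws:eks:<region>:<account-id>:cluster/<cluster-name>
  server: https://<eks-cluster-id>.<region>.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<cluster-name>",
        "roleARN": "arn:aws:iam::<account-id>:role/<argocd-irsa-role>"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<output of: aws eks describe-cluster --query cluster.certificateAuthority --output text>"
      }
    }

and then, for step 8:

argocd app create my-app \
  --repo https://github.com/<org>/<repo>.git \
  --path <path-in-repo> \
  --dest-namespace default \
  --dest-server https://<eks-cluster-id>.<region>.eks.amazonaws.com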

@tbondarchuk

@helenabutowatcisco many thanks for the instructions! They worked for me even with the external cluster in another account; I just had to create an argocd-manager role in the external account, add it to aws-auth, and allow assume from the argocd IRSA role in argocd's account.

Worth noting that after creating the Secret with the cluster's info, its status becomes Unknown, but it actually works; you just need to create an app on it, and then it'll change to Successful. Don't spend time debugging the connection like I did :)

@tbondarchuk

Looks like IRSA roles work even with the 1.6.2 images. I found this out while experimenting and switching between latest and 1.6.2. I suspect that after adding the SA annotation with the role name, the pods have to be recreated to pick up the change.

The latest image appears to have some issues with resource tracking and constantly shows apps as out of sync. One example: cert-manager deployed from a proxy helm chart with a cluster issuer created from a manifest in the templates folder (I can share details if anybody is interested).

Important note on multi-cluster management: application names have to be unique in argocd, so to deploy the same app to multiple clusters, different app names have to be used. The solution that worked for me:

  • use custom label key in argocd-cm application.instanceLabelKey: argocd.argoproj.io/instance
  • create applications with cluster name suffix cert-manager-stage, cert-manager-poc
  • use the helm release name to force argocd to create resources without the cluster name suffix (I don't know yet how to do the same with kustomized manifests, but it should be possible, I think)
  • use different projects per cluster

Final configs (essentials):

argocd/kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
bases:
  - github.com/argoproj/argo-cd/manifests/cluster-install?ref=v1.6.2
patchesStrategicMerge:
  - overlays/argocd-application-controller-deployment.yaml
  - overlays/argocd-application-controller-sa.yaml
  - overlays/argocd-cm.yaml
  - overlays/argocd-server-deployment.yaml
  - overlays/argocd-server-sa.yaml

overlays/argocd-application-controller-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-application-controller
spec:
  template:
    spec:
      securityContext:
        fsGroup: 999

overlays/argocd-application-controller-sa.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller
  annotations:
    eks.amazonaws.com/role-arn: ARGOCD_ROLE

overlays/argocd-cm.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
data:
  application.instanceLabelKey: argocd.argoproj.io/instance

overlays/argocd-server-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  template:
    spec:
      securityContext:
        fsGroup: 999

overlays/argocd-server-sa.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-server
  annotations:
    eks.amazonaws.com/role-arn: ARGOCD_ROLE

Where ARGOCD_ROLE is the IRSA IAM role for argocd's cluster, with permissions to assume the target clusters' admin IAM roles.
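For reference, a minimal sketch of the permissions policy such an ARGOCD_ROLE could carry; the target-account ARN is a placeholder, and the IRSA trust policy on the role is separate:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": [
        "arn:aws:iam::<target-account>:role/<target-cluster-admin-role>"
      ]
    }
  ]
}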

App config example:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  namespace: argocd
  name: cert-manager-{{ .Values.spec.destination.clusterName }} # per-cluster App name
  finalizers:
    - resources-finalizer.argocd.argoproj.io

spec:
  destination:
    namespace: kube-system
    server: {{ .Values.spec.destination.server }} # dest cluster

  project: {{ .Values.spec.project }} # per-cluster project

  source:
    path: applications/cert-manager
    repoURL: {{ .Values.spec.source.repoURL }}
    targetRevision: {{ .Values.spec.source.targetRevision }}
    helm:
      releaseName: cert-manager # ensure actual resources won't have cluster name suffix

@musabmasood

I think there should be better documentation on how to properly configure awsAuthConfig

@musabmasood

musabmasood commented Aug 14, 2020

Here's a rough recipe for what I've managed to get working with IAM roles scoped to a Kubernetes ServiceAccount and a single AWS account: [steps 1–8 quoted in full from the comment above]

So you were able to make it work with just one role for ArgoCD SA? You didn't have to create any roles for the clusters that ArgoCD was managing?

@helenabutowatcisco

So you were able to make it work with just one role for ArgoCD SA? You didn't have to create any roles for the clusters that ArgoCD was managing?

Correct. Just one IAM role.

@musabmasood

So you were able to make it work with just one role for ArgoCD SA? You didn't have to create any roles for the clusters that ArgoCD was managing?

Correct. Just one IAM role.

We are struggling to make it work; the cluster just shows up as failed in the UI, but there are no logs and no way to enable verbose logging. Not sure what is wrong. Our current setup:

  • ArgoCD cluster and destination cluster are in separate AWS accounts
  • OIDC-assumable role created for argocd in the AWS account running argocd
  • aws-auth configMap edited and added the following:
    - "groups":
      - "system:masters"
      "rolearn": "arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
      "username": "arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
  • added the annotation eks.amazonaws.com/role-arn: arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd to both the argocd-application-controller and argocd-server service accounts
  • Created the cluster secret as follows:
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: custom-name-for-the-cluster
  server: <TARGET_CLUSTER_API_SERVER_ADDRESS>
  config: |
    {
      "insecure": false,
      "awsAuthConfig": {
          "clusterName": "<REAL_NAME_OF_THE_TARGET_CLUSTER>",
          "roleARN": "eks.amazonaws.com/role-arn: arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
      },
      "tlsClientConfig": {
        "caData": "xxx"
      }
    }

Not sure what we are missing 😢

@maxbrunet
Contributor

maxbrunet commented Aug 14, 2020

Tackling the issue with @musabmasood, I have copied this kubeconfig into the argocd-server and argocd-application-controller pods:

# /home/argocd/.kube/config
apiVersion: v1
kind: Config
preferences: {}
clusters:
  - cluster:
        certificate-authority-data: <base64-encoded ca-data>
        server: https://<eks-cluster-specific-host>.eks.amazonaws.com
    name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
users:
  - name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
    user:
        exec:
            apiVersion: client.authentication.k8s.io/v1alpha1
            args:
              - eks
              - get-token
              - --cluster-name
              - <cluster-name>
            command: aws
contexts:
  - context:
        cluster: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
        user: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
    name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>

And I am able to reach the target cluster:

argocd@argocd-server-5cc5c44949-8rqhc:~$ kubectl cluster-info --context arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
Kubernetes master is running at https://<eks-cluster-specific-host>.eks.amazonaws.com
CoreDNS is running at https://<eks-cluster-specific-host>.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://<eks-cluster-specific-host>.eks.amazonaws.com/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

But with the cluster secret which apparently does the same, it does not work:

if c.Config.AWSAuthConfig != nil {
    args := []string{"eks", "get-token", "--cluster-name", c.Config.AWSAuthConfig.ClusterName}
    if c.Config.AWSAuthConfig.RoleARN != "" {
        args = append(args, "--role-arn", c.Config.AWSAuthConfig.RoleARN)
    }

apiVersion: v1
kind: Secret
metadata:
  name: cluster-<cluster-name>
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: <cluster-name>
  server: https://<eks-cluster-specific-host>.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
          "clusterName": "<cluster-name>",
      },
      "tlsClientConfig": {
        "caData": "<base64-encoded ca-data>",
        "insecure": false,
      }
    }

We only see it as "Failed" when listing the cluster in the UI. We have increased the log level to debug, but it does not add any new messages. We have been able to get getting credentials: exec: exit status 255 when trying to create or sync an application, but I have not been able to identify where it comes from in the code yet.
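Based on the quoted code path, the secret above should make Argo CD invoke roughly the following exec command; running it by hand inside the pod is one way to surface the underlying error (a sketch, with the cluster name being the placeholder from the secret):

aws eks get-token --cluster-name <cluster-name>
echo $?   # a non-zero exit code here is what surfaces as "getting credentials: exec: exit status 255"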

@savealive

savealive commented Aug 23, 2020

Tested both v1.6.2 and v1.7.0-rc1
Role config in account where we run ArgoCD

module "iam_assumable_role" {
  source                        = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
  version                       = "~> v2.18.0"
  create_role                   = true
  role_name                     = "k8s-argocd-admin"
  provider_url                  = replace(data.aws_eks_cluster.this.identity.0.oidc.0.issuer, "https://", "")
  role_policy_arns              = []
  oidc_fully_qualified_subjects = ["system:serviceaccount:argocd:argocd-server", "system:serviceaccount:argocd:argocd-application-controller"]
}

data "aws_eks_cluster" "this" {
  name = var.argocd_cluster_name
}

Target cluster nonprod-useast1-cluster2 aws-auth configmap:

    - "groups":
      - "system:masters"
      "rolearn": "arn:aws:iam::369115111111:role/k8s-argocd-admin"
      "username": "argocd-admin"

ArgoCD cluster config (notice NO role is specified here; it's not needed, as we allowed the ArgoCD SA IRSA role direct access to the argocd-managed cluster):

apiVersion: v1
kind: Secret
metadata:
  name: nonprod-useast1-cluster2
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: nonprod-useast1-cluster2
  server: https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "nonprod-useast1-cluster2"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EVXlOekUzTkRFeU1Gb1hEVE13TURVeU5URTNOREV5TUZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTmJZCkR4cGdGMHFPdGpWSWhUUVJ0cXdnWC8weEtwdjhRMWJkejdHM0tyU1Z0bWtjYXBIbE94Yit2ODUyRXM5T2liYWMKT0I3eEk4NFpoWnRIZlRJZlNNUG5mSDFsdFUvSXcwZzZ6T0prNGl5bHVrdWxISHJOcXdDY2hvMmRTQno5Sm9NcApuNU1mSGpmV2I5RVppeERZeW5sZXdCK1dWTGIwdkNvTzNNeEloT3RTVG00djB4ZEJMQzBzM29Dd1lmNy9kaFd6CkZsZVZFNmYvY0xkNW1aclRjdlF6TzFzYVIrcEQ4T1FCblVjSXUrT2lXSTV5c2d4SUphWEFJRkZYSU5mWWF5Y2gKZGVJVUdYazAxMjNtMWNRaGJ3eGtEYzhnZkNVMlMxenFHZGVQNElOSGhTcCthSFN4cmJIa3dRYWxO"
      }
    }

ArgoCD v1.6.2
Cluster created but reporting error:
https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com nonprod-useast1-cluster2 Failed Unable to connect to cluster: the server has asked for the client to provide credentials
However, it's possible to deploy applications to it:

gb2   https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com  guestbook1  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/argoproj/argocd-example-apps.git  kustomize-guestbook  HEAD

ArgoCD v1.7.0-rc1
The new version has different cluster health discovery logic: it shows a cluster with "unknown" status until you deploy an app into it.
Once an app is deployed, the cluster becomes "green" in the UI. The app is deployed successfully.

https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com  nonprod-useast1-cluster2      1.17+    Successful

So I'm pretty sure it was a UI bug, or a bug related to the cluster discovery mechanism, and it's resolved in v1.7.


@okdas

okdas commented Aug 26, 2020

@maxbrunet try to shell into the pod and run aws eks --region us-east-1 update-kubeconfig --name cluster-2. That's how I found out about the missing security context (fsGroup: 999) - the aws cli couldn't access the IRSA token.

@maxbrunet
Contributor

@okdas yes, the fsGroup is needed (when running as non-root); the kubectl cluster-info command I posted runs inside the pod using the WebIdentity token, so it was readable, and as @musabmasood said, we made it work with 1.7. Thanks

@jurgenweber

any luck in getting this to work with a ServiceAccount?

@helenabutowatcisco yes, I got it all working. I hope you did also.

@tiagocborg

Here's a rough recipe for what I've managed to get working with IAM roles scoped to a Kubernetes ServiceAccount and a single AWS account: [steps 1–8 quoted in full from the comment above]

@helenabutowatcisco just to see if I understand: in step 6, did you add to the aws-auth ConfigMap of the cluster in account B the same role you created in account A and annotated on the Argo service accounts?

@modulo1982

I recently implemented this; the above comments and references to the documentation helped me a lot. It still took me quite some time to get a working setup.

I wrote a tutorial describing a basic setup that works and is fairly flexible: https://www.modulo2.nl/blog/argocd-on-aws-with-multiple-clusters

Hope this helps someone.

@JasonKAls

JasonKAls commented May 30, 2021

I recently implemented this, the above comments and references to the documentation helped me a lot. It still took me quite some time to get a working setup.

I wrote a tutorial describing a basic setup that works and is fairly flexible: https://www.modulo2.nl/blog/argocd-on-aws-with-multiple-clusters

Hope this helps someone.

This was extremely helpful. There need to be clearer examples of this in the documentation.

@jackivanov

jackivanov commented Jul 5, 2021

With all the above, I'm still getting the server has asked for the client to provide credentials. Has anything changed since then?

@vitarkah

vitarkah commented Apr 4, 2022

I merely added the service role of the cluster where argocd is running to the aws-auth of the cluster that Argo wants to "manage". This worked, but I need opinions on the best security practice here.

- groups:
  - system:bootstrappers
  - system:nodes
  rolearn: arn:aws:iam::<account>:role/<cluster-name>-cluster-ServiceRole-B4UFCVZ99KE8
  username: system:node:{{EC2PrivateDNSName}}

@jgourmelen

I followed this tutorial: https://www.modulo2.nl/blog/argocd-on-aws-with-multiple-clusters (with my own EKS cluster)

But I have this error:
[argo-cd-argocd-server-58c46798d5-tjm7j] An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::123456789:assumed-role/node_group-eks-node-group-20220328141247459765646001/i-0fe964864900d3e3 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::123456789:role/Deployer

@mmerickel

Just some quick notes on setting this up while sifting through a lot of confused comments above.

  • The role you create (roleA) doesn't need any permissions. It just needs the trust policy so that argocd can assume it just like any IRSA role. And yes the service accounts on both the application-controller and the server need to be able to assume the role. That's enough for argocd to generate eks credentials - just "be" roleA. It doesn't need any attached policies.
  • You can then put roleA into the target cluster's aws-auth map as a role with system:masters in its groups.
  • The roleArn=roleB attribute can be left out if you don't want argocd to assume roleB after assuming roleA. Sometimes people make a separate role in the target account that has access to the cluster, and so you have to assume that first - but it's not required. For example in the blog post they call it the Deployer role.
  • Finally, argocd doesn't show any status for the cluster connection until you actually deploy an app into it, which is a bit weird.

#2347 (comment) is a great guide, and hopefully I've answered some of the ambiguities in there in a couple spots that had a bit of doubt.

@danielnazareth89

we had the below config working great

  config: '{"tlsClientConfig":{"insecure":false,"caData":"XX"},"awsAuthConfig":{"clusterName":"<name in EKS console>","roleARN":"arn:aws:iam::00000000:role/argocd-role"}}'
  name: <descriptive name for this cluster>
  server: https://XXXXXXXX.gr7.<region>.eks.amazonaws.com

but this PR appears to have introduced a new CLI for auth without any docs on how to update the config for EKS. Still experimenting and will post back if we can figure out the new config.

@AmmarovTou

Hi, I have EKS clusters in the same account (the cluster where argocd is running and the external clusters are in the same account).

Approach 1:

I am able to add an external cluster by:

  • creating a kubernetes service account in the external cluster(with a service account token).
  • get the service account token from the external cluster, and provide it in the external cluster config file, in the bearerToken field.
  • deploy the cluster secret(external cluster config) to argocd namespace.
  • then I am able to create and sync the application to the external cluster.

Versions:
EKS Kubernetes 1.20
ArgoCD: v2.4.5


However, the approach above means using a long-lived token (the bearerToken) which does not expire.

Approach 2:

So I am trying to use the awsAuthConfig approach, which uses the new argocd-k8s-auth (introduced in https://github.com/argoproj/argo-cd/pull/8032/files on 15 Apr 2022; I think its earliest availability was in v2.4.0-rc1 on 07 May 2022), assuming that argocd-k8s-auth gets shorter-lived tokens and also takes care of renewing the token when it is about to expire.

behaviour summary:

When I deploy the external cluster secret, the cluster status is Unknown (which is the usual behavior). As soon as I try to create an application (through the argocd UI) for that external cluster, it logs me out of the argocd UI and shows the error:
the server has asked for the client to provide credentials

Versions:
EKS Kubernetes 1.20
ArgoCD: v2.4.5
also tried with v2.3.3 (assuming that this version still had aws cli), and it gave the same error:
the server has asked for the client to provide credentials

The full error message when using argocd v2.4.5:

Unable to create application: error while validating and normalizing app: 
error validating the repo: 
the server has asked for the client to provide credentials

I did increase the log level to debug; I didn't notice anything new around that error log line (the same info log lines around it, same as before increasing the log level).

Steps to reproduce:

Assume the following:

  • argocd is deployed in a cluster in eu-west-2.
  • the external cluster is in eu-west-1.
  • both clusters are in the same account.

Then:

  1. prepare AWS IAM role to be assumed by argocd-server and argocd-application-controller pods:
    1.1. Prepare the assume policy doc(specifying both argocd-server and argocd-application-controller service accounts), and call it eu-west-2-cluster-trust.json:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:oidc-provider/oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>"
          },
          "Action": [
            "sts:AssumeRole",
            "sts:AssumeRoleWithWebIdentity"
          ],
          "Condition": {
            "StringEquals": {
              "oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>:aud": "sts.amazonaws.com",
              "oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>:sub": "system:serviceaccount:argocd:argocd-server"
            }
          }
        },
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:oidc-provider/oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>"
          },
          "Action": [
            "sts:AssumeRole",
            "sts:AssumeRoleWithWebIdentity"
          ],
          "Condition": {
            "StringEquals": {
              "oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>:aud": "sts.amazonaws.com",
              "oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>:sub": "system:serviceaccount:argocd:argocd-application-controller"
            }
          }
        }
      ]
    }
    
    1.2 create the AWS IAM role:
    aws iam create-role --role-name argocd-deployer \
      --assume-role-policy-document file://eu-west-2-cluster-trust.json \
      --description "IAM Role to be used by ArgoCD."
    
    1.3. get the role arn:
    aws iam get-role --role-name argocd-deployer --query Role.Arn
    
    1.4. create policy.json file, and provide the role arn in the Resource field(or just keep it "*" ):
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ArgoCDSTS",
                "Effect": "Allow",
                "Action": [
                    "sts:AssumeRole",
                    "sts:AssumeRoleWithWebIdentity"
                ],
                "Resource": "*"
            }
        ]
    }
    
    1.5 attach the policy to argocd-deployer role:
    aws iam put-role-policy --role-name argocd-deployer \
      --policy-name ArgoCDSTS \
      --policy-document file://policy.json
    
  2. prepare a service account config and deploy it to the external cluster:
    2.1. service account config, call it argocd-manager-sa.yaml:
    apiVersion: v1
    kind: ServiceAccount
    # with or without the annotation, it didn't make a difference.
    metadata:
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
      name: argocd-manager
      namespace: kube-system
    
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: argocd-manager-role
    
    rules:
    - apiGroups:
      - '*'
      resources:
      - '*'
      verbs:
      - '*'
    - nonResourceURLs:
      - '*'
      verbs:
      - '*'
    
    ---
    
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: argocd-manager-role-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: argocd-manager-role
    subjects:
    # whether the subject kind is group or service account, it didn't make a difference.
    - kind: Group
      name: "argocd:manager"
    
    2.2. deploy the service account config to the external cluster(in eu-west-1):
    kubectl apply -f argocd-manager-sa.yaml --context arn:aws:eks:eu-west-1:<AWS_ACCOUNT_NUMBER>:cluster/<CLUSTER_NAME>
    
    2.3. edit the aws-auth config map of the external cluster(in eu-west-1), and add the argocd-deployer role:
     kubectl --context arn:aws:eks:eu-west-1:<AWS_ACCOUNT_NUMBER>:cluster/<CLUSTER_NAME> -n kube-system edit configmap/aws-auth
    
    under mapRoles add:
        - rolearn: arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
          username: arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
          groups:
            - argocd:manager
    
    # as a test, I even assigned the other powerful groups, it didn't make a difference.
        - rolearn: arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
          username: arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
          groups:
            - system:masters
            - system:bootstrappers
            - system:nodes
            - argocd:manager
    
  3. in the cluster where argocd is running, patch argocd-server and argocd-application-controller service account with the aws role arn, and the deployments with fsGroup 999:
    kubectl -n argocd patch serviceaccount argocd-application-controller --type=json \
        -p="[{\"op\": \"add\", \"path\": \"/metadata/annotations/eks.amazonaws.com~1role-arn\", \"value\": \"arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer\"}]"
    
    kubectl -n argocd patch serviceaccount argocd-server --type=json \
        -p="[{\"op\": \"add\", \"path\": \"/metadata/annotations/eks.amazonaws.com~1role-arn\", \"value\": \"arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer\"}]"
        
    
    kubectl -n argocd patch deployment argocd-server --type=json  \
        -p='[{"op": "add", "path": "/spec/template/spec/securityContext/fsGroup", "value": 999}]'
    
    kubectl -n argocd patch statefulset argocd-application-controller --type=json \
        -p='[{"op": "add", "path": "/spec/template/spec/securityContext/fsGroup", "value": 999}]'
    
    after the fsGroup patch, argocd-server and argocd-application-controller pods will be restarted, kubectl -n argocd get pods.
  4. create a secret representing the external cluster configs, and call it eu-west-1-cluster-secret.yaml:
    apiVersion: v1
    kind: Secret
    metadata:
      name: eu-west-1-cluster-secret
      namespace: argocd
      labels:
        argocd.argoproj.io/secret-type: cluster
    type: Opaque
    stringData:
      name: arn:aws:eks:eu-west-1:<AWS_ACCOUNT_NUMBER>:cluster/<CLUSTER_NAME>
      server: https://<IDENTIFIER>.gr7.eu-west-1.eks.amazonaws.com
      config: |
        {
          "awsAuthConfig": {
            "clusterName": "arn:aws:eks:eu-west-1:<AWS_ACCOUNT_NUMBER>:cluster/<CLUSTER_NAME>",
            "roleARN": "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer"
          },
          "tlsClientConfig": {
            "insecure": false,
            "caData": "<BASE64_DATA>"
          }
        }
    
    deploy the external cluster secret to argocd:
    kubectl apply -f eu-west-1-cluster-secret.yaml
    
  5. create an application for the external cluster:
    Open ArgoCD UI --> new app --> edit as yaml --> provide the below contents --> save --> create
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: temp-busybox
    spec:
      destination:
        name: ''
        namespace: default
        server: 'https://<IDENTIFIER>.gr7.eu-west-1.eks.amazonaws.com'
      source:
        path: .
        repoURL: 'https://<SOME_REPO>.git'
        targetRevision: main
        helm:
          valueFiles:
            - values.yaml
      project: default
      syncPolicy:
        syncOptions:
          - ApplyOutOfSyncOnly=true
    
    at this point it logs me out of argocd UI, and shows the error:
    the server has asked for the client to provide credentials

My questions:

  1. Is anyone able to notice anything wrong with the steps above?
  2. Is anyone able to use the awsAuthConfig approach? If so, could you please share?
  3. What should I provide when trying to use awsAuthConfig (which utilizes argocd-k8s-auth)?
  4. In relation to the awsAuthConfig approach, I'm not exactly sure what the newAWSCommand is trying to do. Sure, there's a sequence of steps which returns a token, but I don't know if and how I can reproduce or verify it manually (see the sketch at the end of this comment).
    I did a describe for the argocd-server pod, under environment:
          AWS_DEFAULT_REGION:                                eu-west-2
          AWS_REGION:                                        eu-west-2
          AWS_ROLE_ARN:                                      arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
          AWS_WEB_IDENTITY_TOKEN_FILE:                       /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    
    so did:
    k -n argocd exec argocd-server-6d5b484ccf-c2m54 cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    
    then decode the token:
    jwt decode <token>
    returns:
    Token header
    ------------
    {
      "alg": "RS256",
      "kid": "<SOME_ID>"
    }
    
    Token claims
    ------------
    {
      "aud": [
        "sts.amazonaws.com"
      ],
      "exp": 1660996837,
      "iat": 1660910437,
      "iss": "https://oidc.eks.eu-west-2.amazonaws.com/id/<OIDC_PROVIDER_ID>",
      "kubernetes.io": {
        "namespace": "argocd",
        "pod": {
          "name": "argocd-server-6d5b484ccf-c2m54",
          "uid": "<SOME_ID>"
        },
        "serviceaccount": {
          "name": "argocd-server",
          "uid": "<SOME_ID>"
        }
      },
      "nbf": 1660910437,
      "sub": "system:serviceaccount:argocd:argocd-server"
    }
    
    shows that this token was valid for 24 hours:
    date -d @1660996837
    Sat 20 Aug 2022 13:00:37 IST
    

Thanks.
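Regarding question 4: one way to exercise the exec plugin by hand is to run argocd-k8s-auth directly inside the argocd-server pod; a sketch, using the same placeholder names as in the steps above:

kubectl -n argocd exec -it argocd-server-6d5b484ccf-c2m54 -- \
  argocd-k8s-auth aws \
    --cluster-name <CLUSTER_NAME> \
    --role-arn arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer
# on success this prints an ExecCredential JSON containing a short-lived token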

@AmmarovTou

Hi @jannfis, sorry if I have disturbed you.

Do you reckon I should open a separate issue for my previous comment above?
If so, which issue type would be better: a bug report or a question?
Or is the comment above visible enough, so there is no need for a separate issue?

Thanks.

@ONordander

Hey @AmmarovTou
I think I see an issue with your setup:
Your service accounts for argocd-server and argocd-application-controller should be annotated with their own roles, not argocd-deployer.
Your awsAuthConfig already specifies that it should use the argocd-deployer role, and since you allow argocd-server & argocd-application-controller to assume that role, it should work.
Also, you don't need to create the ServiceAccount argocd-manager in the remote cluster; argocd-deployer is already mapped to system:masters, so it has admin access.

@bigflood

@AmmarovTou

clusterName is the cluster name, not an ARN.
(arn:aws:eks:eu-west-1:<AWS_ACCOUNT_NUMBER>:cluster/<CLUSTER_NAME> -> <CLUSTER_NAME>)

{
  "awsAuthConfig": {
    "clusterName": "<CLUSTER_NAME>",
    "roleARN": "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer"
  },
  "tlsClientConfig": {
    "insecure": false,
    "caData": "<BASE64_DATA>"
  }
}

And if the external cluster is in a different region, you must specify the region.
However, awsAuthConfig doesn't have a region field, so your only option is execProviderConfig,
like below:

{
  "execProviderConfig": {
    "apiVersion": "client.authentication.k8s.io/v1beta1",
    "command": "argocd-k8s-auth",
    "args": [
      "aws",
      "--cluster-name",
      "<CLUSTER_NAME>",
      "--role-arn",
      "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/argocd-deployer"
    ],
    "env": {
      "AWS_REGION": "eu-west-1"
    }
  },
  "tlsClientConfig": {
    "insecure": false,
    "caData": "<BASE64_DATA>"
  }
}

@bkk-bcd

bkk-bcd commented Jun 11, 2023

Can someone outline what troubleshooting steps they may have taken? I'm not clear yet how to debug it. I'm getting the following error:

getting credentials: exec: executable argocd-k8s-auth not found

@blakepettersson
Member

I added some docs on how to configure Argo with EKS (see #14187), hope this helps

@sachin-net

Is there any way to dynamically get the caData for any target cluster that resides in a different AWS account and region?

Background:
I would be using Terraform to deploy argocd in the central / management cluster, and I want to create the cluster Secrets using Terraform as well. I do not want to store caData for every cluster within my repo.

If the target account's IAM role has permission to perform describe-cluster, and within awsAuthConfig we already have the clusterName and roleARN, can we somehow make argocd-k8s-auth dynamically fetch the caData for a target cluster?

Equivalent AWS CLI Command : aws eks describe-cluster --name testing --query "cluster.certificateAuthority" --output text

    {
      "awsAuthConfig": {
          "clusterName": "testing",
          "roleARN": "arn:aws:iam::12345678919:role/Deployer"
      },
      "tlsClientConfig": {
        "caData": "XXXYYYZZZ"
      }
    }

@k1rk

k1rk commented Jul 31, 2023

@sachin-net if you will be creating the secret with Terraform, you can leverage Terraform for this as well. The aws_eks_cluster resource provides the CA as an output: base64decode(aws_eks_cluster.example.certificate_authority[0].data)
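A hedged Terraform sketch of that idea (data source and names are placeholders); note that for the cluster Secret's tlsClientConfig.caData the examples earlier in this thread use the value exactly as EKS returns it, i.e. still base64-encoded:

data "aws_eks_cluster" "target" {
  name = "testing"
}

locals {
  # already base64-encoded, suitable for tlsClientConfig.caData in the cluster Secret
  target_ca_data = data.aws_eks_cluster.target.certificate_authority[0].data
}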

@mdellavedova

mdellavedova commented Sep 25, 2023

Hi, I'm having similar issues adding a cluster to ArgoCD. The target cluster is in a different region (but in the same AWS account) than the one running ArgoCD. I'm getting the server has asked for the client to provide credentials in the UI, and the secret configuration is the following:

apiVersion: v1
kind: Secret
metadata:
  name: target-cluster
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: "my-eks-cluster-name"
  server: https://<REDACTED>.gr7.eu-central-1.eks.amazonaws.com
  config: |
    {
      "execProviderConfig": {
        "apiVersion": "client.authentication.k8s.io/v1beta1",
        "command": "argocd-k8s-auth",
        "args": [
          "aws",
          "--cluster-name",
          "my-eks-cluster-name",
          "--role-arn",
          "arn:aws:iam::<AWS_ACCOUNT_ID>:role/sre/<IAM_ROLE_NAME>"
        ],
        "env": {
          "AWS_REGION": "eu-central-1"
        }
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64 encoded certificate>"
      }
    }

I have created IAM roles and annotated the service accounts and edited the aws-auth config map in the target cluster as described in the docs https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#eks
my ArgoCD version: v2.8.4+c279299
EKS version: 1.27

when I run the command in the argocd-server pod it appears to be getting a token:

argocd-k8s-auth aws --cluster-name euc1-1 --role-arn "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/sre/<IAM_ROLE_NAME>"
{"kind":"ExecCredential","apiVersion":"client.authentication.k8s.io/v1beta1","spec":{"interactive":false},"status":{"expirationTimestamp":"2023-09-25T16:18:36Z","token":"k8s-aws-v1.REDACTED"}}

Could you please help? I have spent a number of hours troubleshooting this issue and I'm running out of ideas.
I should also mention that adding the cluster through the argocd CLI (argocd cluster add ...) works as expected.

@mmerickel

I am not sure what example you're following to use execProviderConfig, but if you go back to just using awsAuthConfig, there is no need to care about cross-region details. The endpoint is in another region, but IAM is global, and argocd can grab the creds from its local region and use them to talk to any cluster endpoint that understands them, regardless of region or account.

@mdellavedova

@mmerickel I was following the instructions in the post above #2347 (comment)
I have tried with awsAuthConfig and I'm not getting anywhere either... is there anything I can do to troubleshoot further?

@mmerickel

mmerickel commented Sep 25, 2023

I've been using my comments here plus the linked comment/instructions to manage cross-account cross-region clusters without issues for a while now. All of the configs look like:

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: test-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: "cluster name shown in argocd"
  server: "https://eks-control-plane-endpoint"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<actual-name-of-eks-cluster>"
      },
      "tlsClientConfig": {
        "caData": "the base64 ca data string"
      }
    }

This is all you need assuming:

  • You setup an IAM role in the AWS account where argocd lives.
  • You granted the argocd server and controller IRSA access to that role.
  • You granted that role system:masters access in the target cluster you're trying to let argocd control.

@mdellavedova

@mmerickel thanks for your help, I have changed the setup to what you suggested.
secret:

apiVersion: v1
kind: Secret
metadata:
  name: secret-name
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: "cluster name shown in argocd"
  server: "https://<target-eks-control-plane-endpoint>.gr7.eu-central-1.eks.amazonaws.com"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<target-cluster-name>"
      },
      "tlsClientConfig": {
        "caData": "<CA-data-of-the-target-cluster>" }        
    }

and checked the IAM role (trust relationship, no policies attached):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<AWS-account-number>:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:sub": [
                        "system:serviceaccount:argocd:argocd-server",
                        "system:serviceaccount:argocd:argocd-application-controller"
                    ],
                    "oidc.eks.eu-west-1.amazonaws.com/id/<argocd-cluster-oidc-id>:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}

I annotated the service accounts and updated the aws-auth ConfigMap to point to the role above... same error.
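
For reference, the IRSA annotation on each of those service accounts looks roughly like this (the role name is a placeholder):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller   # same annotation on argocd-server
  namespace: argocd
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS-account-number>:role/<argocd-role-name>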

Could it be a bug in the version of ArgoCD I'm using? Or perhaps something to do with the EKS version?

@mmerickel

First, confirm the cluster is showing up in ArgoCD's UI, probably in a disconnected state; at least then you know the secret is being read correctly. Then look at the argocd-server logs and see if you can find some useful error messages. I think you can also click Invalidate Cache on the cluster to force it to reconnect and watch what it logs while it tries to do that.
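
For example, something along these lines will surface the relevant messages while the reconnect happens (workload names assume a default ArgoCD install; the controller may be a Deployment rather than a StatefulSet in older versions):

kubectl -n argocd logs deploy/argocd-server --since=10m | grep -i cluster
kubectl -n argocd logs statefulset/argocd-application-controller --since=10m | grep -i <target-cluster-name>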

@blakepettersson
Member

As an addition to what @mmerickel said, I'd also take a look in CloudTrail and check for anything weird going on with IAM there (e.g. if the role cannot be assumed for whatever reason).

@blakepettersson
Member

Closing this issue for now since this is something which has been working for some time (both cross-account and cross-region), and with #14187 there's documentation on how to configure it.

blakepettersson closed this as not planned on Oct 1, 2023
@sidewinder12s

I've been using my comments here plus the linked comment/instructions to manage cross-account cross-region clusters without issues for a while now. All of the configs look like:

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: test-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: "cluster name shown in argocd"
  server: "https://eks-control-plane-endpoint"
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<actual-name-of-eks-cluster>"
      },
      "tlsClientConfig": {
        "caData": "the base64 ca data string"
      }
    }

This is all you need, assuming:

  • You set up an IAM role in the AWS account where ArgoCD lives.
  • You granted the ArgoCD server and application controller service accounts IRSA access to that role.
  • You granted that role system:masters access in the target cluster you're trying to let ArgoCD control.

Just checking, @mmerickel: you're saying you are not using cross-account roles in each EKS cluster account, and instead just allowing your main ArgoCD IAM role in each cluster's aws-auth ConfigMap so that ArgoCD can directly request creds for the cluster? Reading through the cross-account setup instructions that have landed, I had started questioning why we need the cross-account role when the aws-auth ConfigMap grants a given IAM role access with seemingly no regard to which account it is from.

@mmerickel

That's correct, that's how I'm doing it right now. In my example the IRSA role given to ArgoCD's server is the same one that the cross-account clusters grant access to, so there are no extra assume-role hops required. So the setup is:

argocd -> irsa role A -> remote cluster B
                      -> remote cluster C

There are some theoretical advantages to using multiple IAM roles, but I didn't think it was worthwhile for me. If you do want an extra IAM role in the middle, you'll need to specify that role in the awsAuthConfig and grant the IRSA role the ability to assume that target role. Then you're doing something like the below:

argocd -> irsa role A -> assume role B -> remote cluster B
                      -> assume role C -> remote cluster C
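
In that case the only change to the cluster secret is the extra roleARN field in awsAuthConfig, roughly like this (the role ARN is a placeholder):

{
  "awsAuthConfig": {
    "clusterName": "<actual-name-of-eks-cluster>",
    "roleARN": "arn:aws:iam::<target-account-number>:role/<role-to-assume>"
  },
  "tlsClientConfig": {
    "caData": "the base64 ca data string"
  }
}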

I'll leave you to decide which makes more sense for your org.

@akefirad

akefirad commented Jan 19, 2024

I managed to glue everything together mainly based on what is shared here, except for adding the argocd-manager role in the main cluster (or argocd-deployer in the child cluster, if you want to use a second role). I couldn't make it work by manually adding the role to the child cluster's aws-auth (to put it in the system:masters group); I'm not sure what the issue was, maybe the username? (Setting it to the role ARN didn't help. In my case the role had a path, so that might have messed something up 🤷.)
Instead I used the new EKS feature: access entries. Just add the argocd-manager role as a cluster admin to the child cluster (it doesn't matter that the role lives in the main cluster's account, i.e. it works with cross-account roles).
Update: it was indeed the path messing it up: kubernetes-sigs/aws-iam-authenticator#268
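
For anyone doing this outside of CDKTF, the equivalent AWS CLI calls look roughly like the following (cluster name, account number and role name are placeholders):

aws eks create-access-entry \
  --cluster-name <child-cluster-name> \
  --principal-arn arn:aws:iam::<argocd-account-number>:role/argocd-manager \
  --type STANDARD

aws eks associate-access-policy \
  --cluster-name <child-cluster-name> \
  --principal-arn arn:aws:iam::<argocd-account-number>:role/argocd-manager \
  --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy \
  --access-scope type=cluster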

@markandrus

@akefirad would you mind sharing how you set up the IAM Access Entry? Does the username matter?

@akefirad

akefirad commented Feb 26, 2024

@markandrus which one, IAM Access entry or IRSA entry? The former doesn't need a username. This is our CDKTF code:

      const argocdEntry = new EksAccessEntry(this, "cluster-argocd-access-entry", {
        clusterName: clusterName,
        principalArn: argocdRoleArn, // including the path.
        // kubernetesGroups: ["admin"], // What is this? Why isn't it working?
      });

      new EksAccessPolicyAssociation(this, "cluster-argocd-access-policy-association", {
        clusterName: clusterName,
        policyArn: "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy",
        principalArn: argocdRoleArn,
        accessScope: { type: "cluster" },
        dependsOn: [argocdEntry],
      });

(I finally managed to make it work with IRSA too, but in my case the issue was that our role has a path, and when setting the role in the aws-auth ConfigMap you should drop the path. See the bug linked above.)
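
For example, for a role created as arn:aws:iam::<account>:role/sre/argocd-manager, the aws-auth entry has to reference it without the /sre/ path, something like:

- rolearn: arn:aws:iam::<AWS-account-number>:role/argocd-manager
  username: arn:aws:iam::<AWS-account-number>:role/argocd-manager
  groups:
    - system:masters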

Does that help?

@fideloper
Contributor

Can we accomplish this without using the aws-auth ConfigMap nowadays? I don't have a clear path to automating updates to that ConfigMap (I'm sure it's possible, but it seems difficult), and the EKS docs claim it's deprecated in favor of creating access entries.

However, when using access entries and skipping aws-auth I get the error the server has asked for the client to provide credentials. (Actually I get that error even if I update aws-auth as well!)

@akefirad

@fideloper I'm not sure; the issue could be some restriction with access entries. I remember there were some restrictions around the feature.

@fideloper
Contributor

fideloper commented Oct 22, 2024

I got it working without aws-auth 🎉 (my issue happened to be using the wrong cluster name, which caused argocd-k8s-auth to generate a bad token).

Using Access Entries worked great, no aws-auth required.

Within Terraform, setting up an access entry per cluster looked like this (this is done on each "spoke" cluster, i.e. an EKS cluster that ArgoCD will manage):

# Create an access entry on a "spoke" EKS cluster so that ArgoCD ("hub" cluster)'s assumed role
# has RBAC permissions to administrate the spoke cluster

resource "aws_eks_access_entry" "argocd_rbac" {
  cluster_name      = "spoke-cluster-name"
  principal_arn     = "arn-of-role-being-assumed-by-argocd"
  kubernetes_groups = []
  type              = "STANDARD"
}

resource "aws_eks_access_policy_association" "argocd_rbac" {
  access_scope {
    type = "cluster"
  }

  cluster_name = "spoke-cluster-name"

  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
  principal_arn = "arn-of-role-being-assumed-by-argocd"

  depends_on = [
    aws_eks_access_entry.argocd_rbac
  ]
}
