Chart argo-cd: The cluster https://kubernetes.default.svc has no assigned shard. #3092

Open
fmunteanu opened this issue Dec 29, 2024 · 5 comments
Labels
argo-cd bug Something isn't working

fmunteanu commented Dec 29, 2024

Describe the bug

When I enable dynamicClusterDistribution in the Helm chart, I get these warnings:

time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (0/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (1/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (2/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (3/3)"
time="2024-12-27T23:58:50Z" level=warning msg="The cluster https://kubernetes.default.svc has no assigned shard."

If I disable it, I only get this warning:

time="2024-12-27T23:58:50Z" level=warning msg="The cluster https://kubernetes.default.svc has no assigned shard."
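
One way to compare the two log excerpts above is to parse them and diff the warning messages. The sketch below assumes the logrus-style `key=value` text format shown in the excerpts. The helper names (`parse_line`, `warning_messages`) are illustrative and not part of Argo CD.

```python
import shlex

# Log excerpts copied from the issue description.
WITH_FLAG = '''
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (0/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (1/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (2/3)"
time="2024-12-27T23:58:43Z" level=warning msg="conflict when getting shard from shard mapping configMap. Retrying (3/3)"
time="2024-12-27T23:58:50Z" level=warning msg="The cluster https://kubernetes.default.svc has no assigned shard."
'''

WITHOUT_FLAG = '''
time="2024-12-27T23:58:50Z" level=warning msg="The cluster https://kubernetes.default.svc has no assigned shard."
'''


def parse_line(line):
    """Split one key=value log line into a dict (hypothetical helper)."""
    fields = {}
    for token in shlex.split(line):  # shlex keeps quoted values together
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def warning_messages(log_text):
    """Return the distinct warning messages in a log excerpt."""
    messages = set()
    for line in log_text.strip().splitlines():
        fields = parse_line(line)
        if fields.get("level") == "warning":
            # Drop the "Retrying (n/3)" suffix so retries collapse into one message.
            messages.add(fields["msg"].split(" Retrying")[0])
    return messages


only_with_flag = warning_messages(WITH_FLAG) - warning_messages(WITHOUT_FLAG)
common = warning_messages(WITH_FLAG) & warning_messages(WITHOUT_FLAG)

if __name__ == "__main__":
    print("only with dynamicClusterDistribution:", only_with_flag)
    print("in both cases:", common)
```

On these excerpts, the conflict retries appear only with dynamicClusterDistribution enabled, while the "no assigned shard" warning appears in both cases.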

ArgoCD works as expected: I can see https://kubernetes.default.svc listed under Clusters, and I can create applications without any issues.

Related helm chart

argo-cd

Helm chart version

7.7.11

To Reproduce

I think the issue is related to this condition:

{{- if and (not .Values.createClusterRoles) .Values.controller.dynamicClusterDistribution }}
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - argocd-app-controller-shard-cm
  verbs:
  - get
  - list
  - watch
  - create
  - update
{{- end }}

Expected behavior

By default, the Helm chart sets createClusterRoles to true. If I understand correctly, the condition reads as:

IF .Values.createClusterRoles IS False AND .Values.controller.dynamicClusterDistribution IS True

createClusterRoles behavior:

  • When createClusterRoles is set to true, ArgoCD's installation process will create the necessary ClusterRole resources. These roles define permissions for ArgoCD components (like the Application Controller) to interact with cluster-wide resources.
  • When createClusterRoles is set to false, the Helm chart or installation script will not create these ClusterRole resources. This may be useful in environments with strict Role-Based Access Control (RBAC) policies where administrators prefer to create and manage such roles manually.

Since the above condition always evaluates to False, argocd-app-controller-shard-cm is never used, hence the warnings in the logs. Do I have this right?
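
The reading of the condition above can be checked with a small truth table. This is a sketch in Python that mirrors only the boolean logic of the `if and (not ...) ...` guard. It does not render the chart, and the function name is hypothetical. It also says nothing about whether the chart grants access to argocd-app-controller-shard-cm elsewhere when createClusterRoles is true.

```python
from itertools import product


def shard_cm_rule_rendered(create_cluster_roles: bool,
                           dynamic_cluster_distribution: bool) -> bool:
    """Mirror of the template guard:

    and (not .Values.createClusterRoles) .Values.controller.dynamicClusterDistribution
    """
    return (not create_cluster_roles) and dynamic_cluster_distribution


def truth_table():
    """Evaluate the guard for all four combinations of the two values."""
    return {
        (create, dynamic): shard_cm_rule_rendered(create, dynamic)
        for create, dynamic in product([True, False], repeat=2)
    }


if __name__ == "__main__":
    for (create, dynamic), rendered in truth_table().items():
        print(f"createClusterRoles={create!s:<5} "
              f"dynamicClusterDistribution={dynamic!s:<5} "
              f"-> rule rendered: {rendered}")
```

Only the combination createClusterRoles=false and dynamicClusterDistribution=true renders this guarded rule. With the chart default of createClusterRoles=true, the block is skipped regardless of dynamicClusterDistribution.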

My goal is to be able to keep createClusterRoles enabled and have proper sharding done with multiple controller replicas.

Screenshots

image

image

Additional context

Full troubleshooting details are provided in argoproj/argo-cd#21181. The related chart settings are:

$ helm get values argo-cd -n kube-system
USER-SUPPLIED VALUES:
applicationSet:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  replicas: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi
configs:
  cm:
    accounts.floren: apiKey, login
    accounts.floren.enabled: "true"
    admin.enabled: false
    exec.enabled: true
    statusbadge.enabled: false
  params:
    application.namespaces: kube-system
    applicationsetcontroller.enable.git.submodule: true
    applicationsetcontroller.enable.new.git.file.globbing: true
    applicationsetcontroller.enable.progressive.syncs: true
    applicationsetcontroller.log.level: warn
    controller.log.level: warn
    controller.sharding.algorithm: consistent-hashing
    dexserver.log.level: warn
    notificationscontroller.log.level: warn
    reposerver.log.level: warn
    resource.exclusions: |
      - apiGroups:
          - cilium.io
          - snapshot.storage.k8s.io
        kinds:
          - CiliumIdentity
          - VolumeSnapshot
          - VolumeSnapshotContent
        clusters:
          - "*"
    server.insecure: true
    server.log.level: warn
  rbac:
    policy.csv: |
      g, floren, role:admin
    policy.default: role:readonly
  secret:
    argocdServerAdminPassword: redacted
controller:
  dynamicClusterDistribution: true
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  replicas: 2
  resources:
    limits:
      memory: 512Mi
    requests:
      cpu: 10m
      memory: 512Mi
dex:
  pdb:
    enabled: true
    minAvailable: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi
global:
  domain: argocd.noty.cc
  logging:
    level: warn
notifications:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi
redis:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi
repoServer:
  autoscaling:
    enabled: true
    maxReplicas: 3
    minReplicas: 1
    targetMemoryUtilizationPercentage: 80
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi
server:
  autoscaling:
    enabled: true
    maxReplicas: 3
    minReplicas: 1
    targetMemoryUtilizationPercentage: 80
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
      interval: 30s
      namespace: kube-system
      scrapeTimeout: 15s
  pdb:
    enabled: true
    minAvailable: 1
  resources:
    limits:
      memory: 128Mi
    requests:
      cpu: 10m
      memory: 128Mi

$ kubectl get cm argocd-cmd-params-cm -n kube-system -o json | jq .data
{
  "application.namespaces": "kube-system",
  "applicationsetcontroller.enable.git.submodule": "true",
  "applicationsetcontroller.enable.leader.election": "false",
  "applicationsetcontroller.enable.new.git.file.globbing": "true",
  "applicationsetcontroller.enable.progressive.syncs": "true",
  "applicationsetcontroller.log.format": "text",
  "applicationsetcontroller.log.level": "warn",
  "applicationsetcontroller.namespaces": "",
  "applicationsetcontroller.policy": "sync",
  "controller.ignore.normalizer.jq.timeout": "1s",
  "controller.log.format": "text",
  "controller.log.level": "warn",
  "controller.operation.processors": "10",
  "controller.repo.server.timeout.seconds": "60",
  "controller.self.heal.timeout.seconds": "5",
  "controller.sharding.algorithm": "consistent-hashing",
  "controller.status.processors": "20",
  "dexserver.log.level": "warn",
  "notificationscontroller.log.level": "warn",
  "otlp.address": "",
  "redis.server": "argo-cd-argocd-redis:6379",
  "repo.server": "argo-cd-argocd-repo-server:8081",
  "reposerver.log.format": "text",
  "reposerver.log.level": "warn",
  "reposerver.parallelism.limit": "0",
  "resource.exclusions": "- apiGroups:\n    - cilium.io\n    - snapshot.storage.k8s.io\n  kinds:\n    - CiliumIdentity\n    - VolumeSnapshot\n    - VolumeSnapshotContent\n  clusters:\n    - \"*\"\n",
  "server.basehref": "/",
  "server.dex.server": "https://argo-cd-argocd-dex-server:5556",
  "server.dex.server.strict.tls": "false",
  "server.disable.auth": "false",
  "server.enable.gzip": "true",
  "server.enable.proxy.extension": "false",
  "server.insecure": "true",
  "server.log.format": "text",
  "server.log.level": "warn",
  "server.repo.server.strict.tls": "false",
  "server.rootpath": "",
  "server.staticassets": "/shared/app",
  "server.x.frame.options": "sameorigin"
}
@fmunteanu fmunteanu added the bug Something isn't working label Dec 29, 2024
@yu-croco
Collaborator

memo 📝
The permission and condition were implemented in #2743.

@fmunteanu
Author

Hi @yu-croco, do you know how I can address these warnings?

@yu-croco
Collaborator

yu-croco commented Dec 30, 2024

Hi @fmunteanu ,

> Since the above condition always evaluates to False, argocd-app-controller-shard-cm is never used, hence the warnings in the logs. Do I have this right?

I am not sure this is the cause, since that rule only applies to Dynamic Cluster Distribution in namespaced mode. I guess you are running Argo CD in cluster mode, given your comment that the condition always evaluates to False?
argo-helm provides the Helm charts to deploy Argo projects, but it does not control how specific features behave. So I think the issue you opened upstream will help you more.
Note in particular that this feature is Alpha, according to the docs.

@fmunteanu
Author

fmunteanu commented Dec 30, 2024

Thank you, I appreciate the response. Even with the dynamic cluster distribution option disabled, I still get the warning described in the OP, and I am not sure how to address it.

@yu-croco
Collaborator

I found where this warning is logged: https://github.com/search?q=repo%3Aargoproj%2Fargo-cd%20%22has%20no%20assigned%20shard%22&type=code
I think digging into the upstream logic will reveal the root cause.
