KubeJobCompletion Prometheus alert for descheduler jobs #432

Closed
KR411-prog opened this issue Oct 28, 2020 · 13 comments · Fixed by #444
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

@KR411-prog commented Oct 28, 2020

We are receiving the KubeJobCompletion Prometheus alert for descheduler jobs with the below alert message:
kube-system/descheduler-1603424400 is taking more than 12 hours to complete.

The descheduler values file config is as shown below,

fullnameOverride: descheduler
nameOverride: descheduler
deschedulerPolicy:
  nodeSelector: kops.k8s.io/instancegroup=nodes
  evict-local-storage-pods: false
  maxNoOfPodsToEvictPerNode: 4
  strategies:
    RemoveDuplicates:
      enabled: true
    RemovePodsViolatingInterPodAntiAffinity:
      enabled: true
    RemovePodsViolatingNodeAffinity:
      enabled: true
    LowNodeUtilization:
      enabled: true
      params:
        numberOfNodes: 2
        nodeResourceUtilizationThresholds:
          # node is underutilized if all 3 metrics are below the threshold
          thresholds:
            "cpu" : 50
            "memory": 30
            "pods": 30
          # node is overutilized is any of these 3 metrics is above the target threshold
          targetThresholds:
            "cpu" : 80
            "memory": 70
            "pods": 50
    RemovePodsHavingTooManyRestarts:
      enabled: true
      params:
        podsHavingTooManyRestarts:
          podRestartThreshold: 100
          includingInitContainers: true
    PodLifeTime:
      enabled: false
rbac:
  create: true
schedule: "*/2 * * * *"

I am not sure if there is a way to tune the config to delete the job if it takes more than 30 minutes. I don't find that tuning option in the values file of this helm chart.
chart: descheduler/descheduler-helm-chart
version: "0.19.0"

Any help on how to avoid this alert? Is there any tuning that can be done in the descheduler config?

@seanmalloy (Member)

/triage support

@k8s-ci-robot (Contributor)

@seanmalloy: The label(s) triage/support cannot be applied, because the repository doesn't have them

In response to this:

/triage support

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@seanmalloy (Member)

/kind support

k8s-ci-robot added the kind/support label (Categorizes issue or PR as a support question.) on Oct 29, 2020
@seanmalloy (Member)

@KR411-prog thanks for opening this issue. Please provide the below details and we will try to help.

Please provide the full CronJob yaml with any sensitive info redacted.

kubectl get cronjob -n mynamespace mycronjob -o yaml

What k8s version are you using?

Please provide the pod log for the long-running descheduler CronJob pod with any sensitive info redacted.

kubectl logs -n mynamespace mypod

@seanmalloy (Member)

@KR411-prog it would also be helpful to know if the descheduler pod is maxing out its CPU/memory requests or limits. Also, roughly how many nodes and pods are in the cluster?
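For reference, one quick way to check live usage against the requests/limits (assuming metrics-server is installed and the pods carry the chart's usual app.kubernetes.io/name=descheduler label) is:

# requires metrics-server; shows current CPU/memory usage for the descheduler pods
kubectl top pod -n kube-system -l app.kubernetes.io/name=descheduler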

Thanks!

@KR411-prog (Author)

Here is the cronjob config,

apiVersion: v1
items:
- apiVersion: batch/v1beta1
  kind: CronJob
  metadata:
    annotations:
      meta.helm.sh/release-name: descheduler
      meta.helm.sh/release-namespace: kube-system
    creationTimestamp: "2020-09-17T22:12:55Z"
    labels:
      app.kubernetes.io/instance: descheduler
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: descheduler
      app.kubernetes.io/version: 0.19.0
      helm.sh/chart: descheduler-helm-chart-0.19.0
    name: descheduler
    namespace: kube-system
    resourceVersion: "56889398"
    selfLink: /apis/batch/v1beta1/namespaces/kube-system/cronjobs/descheduler
    uid: 8ca58fd8-4988-41fb-910a-d2f0cc7e7e9c
  spec:
    concurrencyPolicy: Forbid
    failedJobsHistoryLimit: 1
    jobTemplate:
      metadata:
        creationTimestamp: null
      spec:
        template:
          metadata:
            annotations:
              checksum/config: ea19993a2d8da1e8b8774c541cfb67debbdb62ae9505a4bc4ca238320c271805
            creationTimestamp: null
            labels:
              app.kubernetes.io/instance: descheduler
              app.kubernetes.io/name: descheduler
            name: descheduler
          spec:
            containers:
            - args:
              - --policy-config-file
              - /policy-dir/policy.yaml
              - --v
              - "3"
              command:
              - /bin/descheduler
              image: k8s.gcr.io/descheduler/descheduler:v0.19.0
              imagePullPolicy: IfNotPresent
              name: descheduler-helm-chart
              resources: {}
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
              - mountPath: /policy-dir
                name: policy-volume
            dnsPolicy: ClusterFirst
            priorityClassName: system-cluster-critical
            restartPolicy: Never
            schedulerName: default-scheduler
            securityContext: {}
            serviceAccount: descheduler
            serviceAccountName: descheduler
            terminationGracePeriodSeconds: 30
            volumes:
            - configMap:
                defaultMode: 420
                name: descheduler
              name: policy-volume
    schedule: '*/10 * * * *'
    successfulJobsHistoryLimit: 3
    suspend: false
  status:
    lastScheduleTime: "2020-10-24T00:40:00Z"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

I am unable to capture logs because the issue is not happening today. As soon as I hit it again, I can share the logs here.
163 pods are in this cluster
6 worker nodes
It's an EKS cluster - Kubernetes 1.15

@seanmalloy (Member)

@KR411-prog thanks for providing the additional details. One thing I see is that the descheduler container requests/limits are not set. This might be a bug in the helm chart, but I have not dug into the chart to see whether there is an option to set them.
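If the chart does expose a resources value (I have not verified that for chart version 0.19.0, so treat this as a sketch with illustrative numbers), setting it in the values file would look roughly like:

resources:
  # illustrative values only; tune to the cluster's size
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi

Otherwise the rendered pod keeps resources: {}, as seen in the CronJob posted above.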

Without the logs it will be difficult to determine root cause. Please add the descheduler pod logs if you see this happen again.

Thanks!

@KR411-prog (Author)

Today we got the same issue.

kubectl get jobs -n kube-system
NAME                     COMPLETIONS   DURATION   AGE
descheduler-1604088720   1/1           37s        6m17s
descheduler-1604088840   1/1           37s        4m26s
descheduler-1604088960   1/1           36s        2m25s
descheduler-1604089080   0/1           24s        24s

We received the KubeJobCompletion alert for the descheduler-1604089080 job. The pod logs showed no error, but within a minute or so the pod and job were deleted automatically.

Now I see only new jobs,

kubectl get jobs -n kube-system
NAME                     COMPLETIONS   DURATION   AGE
descheduler-1604092080   1/1           36s        4m59s
descheduler-1604092200   1/1           36s        2m58s
descheduler-1604092320   1/1           37s        58s

So the job that triggered the alert didn't have any failed status:

kubectl describe job descheduler-1604089080 -n kube-system


Name:           descheduler-1604089080
Namespace:      kube-system
Selector:       controller-uid=dd8c06ea-c2cd-42de-9eef-06af31e74d40
Labels:         app.kubernetes.io/instance=descheduler
                app.kubernetes.io/name=descheduler
                controller-uid=dd8c06ea-c2cd-42de-9eef-06af31e74d40
                job-name=descheduler-1604089080
Annotations:    <none>
Controlled By:  CronJob/descheduler
Parallelism:    1
Completions:    1
Start Time:     Fri, 30 Oct 2020 13:18:02 -0700
Completed At:   Fri, 30 Oct 2020 13:18:38 -0700
Duration:       36s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/instance=descheduler
                    app.kubernetes.io/name=descheduler
                    controller-uid=dd8c06ea-c2cd-42de-9eef-06af31e74d40
                    job-name=descheduler-1604089080
  Annotations:      checksum/config: 93edabe3808159d55ef01771bbe791b880656fd7f010a59731302c452628f9cc
  Service Account:  descheduler
  Containers:
   descheduler-helm-chart:
    Image:      k8s.gcr.io/descheduler/descheduler:v0.19.0
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/descheduler
    Args:
      --policy-config-file
      /policy-dir/policy.yaml
      --v
      3
    Environment:  <none>
    Mounts:
      /policy-dir from policy-volume (rw)
  Volumes:
   policy-volume:
    Type:               ConfigMap (a volume populated by a ConfigMap)
    Name:               descheduler
    Optional:           false
  Priority Class Name:  system-cluster-critical
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  2m19s  job-controller  Created pod: descheduler-1604089080-j7pl6

Checking the cronjob manifest, I see concurrencyPolicy set to Forbid. Would adding "startingDeadlineSeconds: 10" help improve the cronjob behaviour?
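For reference, startingDeadlineSeconds is a top-level field of the CronJob spec, alongside concurrencyPolicy and schedule; a minimal sketch of where it would go (the chart values do not currently expose it):

spec:
  concurrencyPolicy: Forbid
  # count a run as missed if it cannot start within 10 seconds of its scheduled time
  startingDeadlineSeconds: 10
  schedule: '*/10 * * * *'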

@KR411-prog (Author)

There is another issue today.
The descheduler Job showed an age of 98m.
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed

The pod events showed an error:

  Warning  FailedCreatePodSandBox  20m (x4308 over 100m)  kubelet  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "48779b946f6a932a7d8f4271ac75f4e06da5bbb76a13f708e5a0c33bed5c3f65" network for pod "descheduler-1604088600-m522f": NetworkPlugin cni failed to set up pod "descheduler-1604088600-m522f_kube-system" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          12s (x5396 over 100m)  kubelet  Pod sandbox changed, it will be killed and re-created.

I think this issue can be resolved by setting activeDeadlineSeconds, but I don't find the "activeDeadlineSeconds" field in the values file of the descheduler chart.
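For reference, activeDeadlineSeconds belongs on the Job spec inside the CronJob's jobTemplate; a minimal sketch of what that would look like if the chart templated it (it is not in the current values file), using the 30-minute limit mentioned earlier:

spec:
  jobTemplate:
    spec:
      # terminate the pods and mark the Job as failed after 30 minutes
      activeDeadlineSeconds: 1800
      template:
        ...  # existing pod template unchanged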

@KR411-prog (Author)

I found the below error in today's occurrence:

Events:
  Type     Reason            Age    From            Message
  ----     ------            ----   ----            -------
  Normal   SuccessfulCreate  7m11s  job-controller  Created pod: time-limited-rbac-1604522700-qqgct
  Normal   SuccessfulDelete  2m31s  job-controller  Deleted pod: time-limited-rbac-1604522700-qqgct
  Warning  DeadlineExceeded  2m31s  job-controller  Job was active longer than specified deadline

The pod was deleted, but the job was still in a failed status with the error "Job was active longer than specified deadline".
Is there anything that can be tuned in the descheduler config to fix this issue?

@damemi (Contributor) commented Nov 13, 2020

I see you are running descheduler v0.19, and you also mentioned your cluster is k8s 1.15. Please note that we currently only support a k8s-to-descheduler version skew of N-3 (see the compatibility matrix: https://github.com/kubernetes-sigs/descheduler/#compatibility-matrix).

I'm not sure if that will relate to your problem, but this seems like more of an issue with the cronjob (though any logs you could get from the descheduler pod would be the best way to tell, possibly at a higher log level like v=4). Do you ever have similar problems running cron jobs for other tools on your cluster?

If you can't resolve that, another option is running the descheduler as a regular deployment with the --descheduling-interval flag set, as sketched below.
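A rough sketch of what that deployment could look like, reusing the image, args, policy ConfigMap, and service account from the CronJob posted above (the 10m interval simply mirrors the current cron schedule and is an assumption):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: descheduler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: descheduler
  template:
    metadata:
      labels:
        app.kubernetes.io/name: descheduler
    spec:
      serviceAccountName: descheduler
      priorityClassName: system-cluster-critical
      containers:
      - name: descheduler
        image: k8s.gcr.io/descheduler/descheduler:v0.19.0
        command:
        - /bin/descheduler
        args:
        - --policy-config-file
        - /policy-dir/policy.yaml
        # run the descheduling loop in-process instead of via CronJob
        - --descheduling-interval
        - 10m
        - --v
        - "3"
        volumeMounts:
        - mountPath: /policy-dir
          name: policy-volume
      volumes:
      - name: policy-volume
        configMap:
          name: descheduler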

@seanmalloy (Member)

In my opinion this is a problem in the descheduler helm chart and also in the k8s manifests found in the top-level kubernetes directory of this repo. There are multiple problems:

  1. CPU and Memory requests/limits are not set
  2. .spec.startingDeadlineSeconds is not configurable for the CronJob

Item 1 from above is a bug in my opinion. I suppose item 2 would be a feature enhancement request.

/kind bug

k8s-ci-robot added the kind/bug label (Categorizes issue or PR as related to a bug.) on Nov 18, 2020
@seanmalloy (Member)

I think I can get the helm chart and k8s yaml manifests updated to hopefully mitigate this issue.

/assign
/remove-kind support
