[bug] Unknown current metrics but HPA is actually working #270

Closed
gaocegege opened this issue May 21, 2019 · 10 comments

Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments


[root@kube-master-1 ~]$ kubectl -n gaoce describe hpa
Name:                       tea
Namespace:                  gaoce
Labels:                     <none>
Annotations:                kubectl.kubernetes.io/last-applied-configuration:
                              {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"tea","namespace":"gaoce"},"spec"...
CreationTimestamp:          Tue, 21 May 2019 14:03:42 +0800
Reference:                  Deployment/tea
Metrics:                    ( current / target )
  resource cpu on pods:     <unknown> / 1005m
  resource memory on pods:  <unknown> / 300Mi
Min replicas:               1
Max replicas:               2
Deployment pods:            1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type    Reason             Age    From                       Message
  ----    ------             ----   ----                       -------
  Normal  SuccessfulRescale  9m50s  horizontal-pod-autoscaler  New size: 2; reason: memory resource above target
  Normal  SuccessfulRescale  2m14s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target

I am running an autoscaling/v2beta1 HPA on Kubernetes 1.12. The status.currentMetrics field is empty, but the HPA still appears to scale on memory usage.

I ran a deployment with one pod that uses about 2Mi of memory, then set:

    - type: Resource
      resource:
        name: memory
        targetAverageValue: 1Mi

It then scaled up to 2 replicas. After a while I changed the target to:

    - type: Resource
      resource:
        name: memory
        targetAverageValue: 300Mi

Then it scaled down to 1.
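
This matches the standard HPA formula for AverageValue targets, assuming pod usage stayed around 2Mi throughout:

    desiredReplicas = ceil(currentReplicas * currentAverageValue / targetAverageValue)
                    = ceil(1 * 2Mi / 1Mi)   = 2    # scale up, capped at maxReplicas
                    = ceil(2 * 2Mi / 300Mi) = 1    # scale down, floored at minReplicas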

I am wondering why the HPA works even though the status is empty. Is this caused by metrics-server or by the HPA controller?

I'd appreciate it if you could help me.
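
(For anyone hitting the same symptom: the raw status field can be inspected directly, using the names from the manifests below, with

    kubectl -n gaoce get hpa tea -o jsonpath='{.status.currentMetrics}'

which shows the same data that describe renders as <unknown>.)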

gaocegege (Author) commented May 21, 2019

The deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "4"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{},"name":"tea","namespace":"gaoce"},"spec":{"replicas":3,"selector":{"matchLabels":{"app":"tea"}},"template":{"metadata":{"labels":{"app":"tea"}},"spec":{"containers":[{"image":"nginxdemos/hello:plain-text","name":"tea","ports":[{"containerPort":80}]}]}}}}
  creationTimestamp: 2019-05-09T12:24:32Z
  generation: 11
  labels:
    app: tea
  name: tea
  namespace: gaoce
  resourceVersion: "4796775"
  selfLink: /apis/extensions/v1beta1/namespaces/gaoce/deployments/tea
  uid: 662ac519-7255-11e9-9c5d-525400c68fb9
spec:
  progressDeadlineSeconds: 2147483647
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: tea
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: tea
    spec:
      containers:
      - image: nginxdemos/hello:plain-text
        imagePullPolicy: IfNotPresent
        name: tea
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 1500m
            memory: 50Mi
          requests:
            cpu: "1"
            memory: 40Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: 2019-05-09T13:11:13Z
    lastUpdateTime: 2019-05-09T13:11:13Z
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 11
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

The HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: tea
  namespace: gaoce
spec:
  maxReplicas: 2
  minReplicas: 1
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: tea
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageValue: 1005m
    - type: Resource
      resource:
        name: memory
        targetAverageValue: 300Mi

The log of metrics server:

I0521 14:19:26.261319       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (667.905µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:19:26.263108       1 reststorage.go:93] No metrics for pod gaoce/tea-69bcd4566b-mbxgv
I0521 14:19:26.263321       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (336.543µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:19:41.260592       1 reststorage.go:93] No metrics for pod gaoce/tea-69bcd4566b-mbxgv
I0521 14:19:41.260900       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (765.019µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:19:41.263064       1 reststorage.go:93] No metrics for pod gaoce/tea-69bcd4566b-mbxgv
I0521 14:19:41.263333       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (523.922µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:11.303071       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (353.932µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:11.305615       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (465.957µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:26.263669       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (349.251µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:26.265787       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (323.009µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:41.281850       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (388.799µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]
I0521 14:20:41.284136       1 wrap.go:42] GET /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea: (360.698µs) 200 [[hyperkube/v1.12.3 (linux/amd64) kubernetes/435f92c/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 192.168.66.0:16341]

The logs of controller-manager:

I0521 06:18:20.981485       1 event.go:221] Event(v1.ObjectReference{Kind:"HorizontalPodAutoscaler", Namespace:"gaoce", Name:"tea", UID:"2f440383-7b8e-11e9-9ed5-525400c68fb9", APIVersion:"autoscaling/v2beta2", ResourceVersion:"4794722", FieldPath:""}): type: 'Normal' reason: 'SuccessfulRescale' New size: 2; reason: memory resource above target
I0521 06:18:20.994176       1 replica_set.go:477] Too few replicas for ReplicaSet gaoce/tea-69bcd4566b, need 2, creating 1
I0521 06:18:20.994475       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"gaoce", Name:"tea", UID:"662ac519-7255-11e9-9c5d-525400c68fb9", APIVersion:"apps/v1", ResourceVersion:"4794786", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set tea-69bcd4566b to 2
I0521 06:18:21.015107       1 event.go:221] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"gaoce", Name:"tea-69bcd4566b", UID:"1a776ea1-7b81-11e9-aa7a-525400dea63d", APIVersion:"apps/v1", ResourceVersion:"4794788", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: tea-69bcd4566b-mbxgv
I0521 06:25:56.299129       1 event.go:221] Event(v1.ObjectReference{Kind:"HorizontalPodAutoscaler", Namespace:"gaoce", Name:"tea", UID:"2f440383-7b8e-11e9-9ed5-525400c68fb9", APIVersion:"autoscaling/v2beta2", ResourceVersion:"4796708", FieldPath:""}): type: 'Normal' reason: 'SuccessfulRescale' New size: 1; reason: All metrics below target
I0521 06:25:56.315792       1 replica_set.go:525] Too many replicas for ReplicaSet gaoce/tea-69bcd4566b, need 1, deleting 1
I0521 06:25:56.315844       1 controller_utils.go:598] Controller tea-69bcd4566b deleting pod gaoce/tea-69bcd4566b-mbxgv
I0521 06:25:56.316051       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"gaoce", Name:"tea", UID:"662ac519-7255-11e9-9c5d-525400c68fb9", APIVersion:"apps/v1", ResourceVersion:"4796764", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set tea-69bcd4566b to 1
I0521 06:25:56.334750       1 event.go:221] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"gaoce", Name:"tea-69bcd4566b", UID:"1a776ea1-7b81-11e9-aa7a-525400dea63d", APIVersion:"apps/v1", ResourceVersion:"4796767", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: tea-69bcd4566b-mbxgv


platbr commented May 27, 2019

Same issue here on Kubernetes 1.13.5 (DigitalOcean). It is still working, but shows <unknown>.
I'm using:

     - command:
       - /metrics-server
       - --kubelet-insecure-tls
       - --kubelet-preferred-address-types=InternalIP,Hostname,ExternalIP
       image: k8s.gcr.io/metrics-server-amd64:v0.3.3

and

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ingressos-api-jobs-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingressos-api-jobs-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: AverageValue
        averageValue: 1000Mi
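
Side note on this manifest: assuming standard Kubernetes quantity semantics, Mi is a memory-style suffix, so "averageValue: 1000Mi" on a cpu metric parses as an enormous raw value that real CPU usage will likely never exceed. CPU AverageValue targets are normally written in cores or millicores, e.g.:

    target:
      type: AverageValue
      averageValue: 1000m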

fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 26, 2019
fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 25, 2019
gaocegege (Author) commented:

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 25, 2019

marcelomanchester commented Oct 10, 2019

Same issue here on Kubernetes 1.11.3.
Deployment:

{
  "kind": "Deployment",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "1570648822212-nginx2",
    "namespace": "default",
    "selfLink": "/apis/extensions/v1beta1/namespaces/default/deployments/1570648822212-nginx2",
    "uid": "d949ca28-eac9-11e9-afd5-4e083cf81508",
    "resourceVersion": "56400583",
    "generation": 1,
    "creationTimestamp": "2019-10-09T19:20:26Z",
    "labels": {
      "app": "1570648822212-nginx"
    },
    "annotations": {
      "deployment.kubernetes.io/revision": "1"
    }
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "app": "1570648822212-nginx"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "app": "1570648822212-nginx"
        }
      },
      "spec": {
        "containers": [
          {
            "name": "container-0",
            "image": "nginx",
            "ports": [
              {
                "containerPort": 80,
                "protocol": "TCP"
              }
            ],
            "resources": {},
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "Always"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "ClusterFirst",
        "securityContext": {},
        "schedulerName": "default-scheduler"
      }
    },
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": 1,
        "maxSurge": 1
      }
    },
    "revisionHistoryLimit": 10,
    "progressDeadlineSeconds": 600
  },
  "status": {
    "observedGeneration": 1,
    "replicas": 1,
    "updatedReplicas": 1,
    "readyReplicas": 1,
    "availableReplicas": 1,
    "conditions": [
      {
        "type": "Available",
        "status": "True",
        "lastUpdateTime": "2019-10-09T19:20:26Z",
        "lastTransitionTime": "2019-10-09T19:20:26Z",
        "reason": "MinimumReplicasAvailable",
        "message": "Deployment has minimum availability."
      },
      {
        "type": "Progressing",
        "status": "True",
        "lastUpdateTime": "2019-10-09T19:20:32Z",
        "lastTransitionTime": "2019-10-09T19:20:26Z",
        "reason": "NewReplicaSetAvailable",
        "message": "ReplicaSet \"1570648822212-nginx2-6bd598775b\" has successfully progressed."
      }
    ]
  }
}

HPA:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-09T21:11:10Z","reason":"SucceededGetScale","message":"the
      HPA controller was able to get the target''s current scale"},{"type":"ScalingActive","status":"False","lastTransitionTime":"2019-10-09T21:11:10Z","reason":"FailedGetResourceMetric","message":"the
      HPA was unable to compute the replica count: missing request for memory on container
      container-0 in pod default/1570648822212-nginx2-6bd598775b-dz6rp"}]'
    autoscaling.alpha.kubernetes.io/metrics: '[{"type":"Resource","resource":{"name":"memory","targetAverageUtilization":51}}]'
  creationTimestamp: "2019-10-09T21:10:40Z"
  name: 1570648822212-nginx2
  namespace: default
  resourceVersion: "56417959"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/1570648822212-nginx2
  uid: 3f389638-ead9-11e9-afd5-4e083cf81508
spec:
  maxReplicas: 1
  minReplicas: 1
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: 1570648822212-nginx2
  targetCPUUtilizationPercentage: 50
status:
  currentReplicas: 1
  desiredReplicas: 0
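
Note that the ScalingActive condition embedded in the annotations above already names a cause: "missing request for memory on container container-0", and the pasted deployment indeed has "resources": {}. Resource-metric HPAs need resource requests set on the target containers; a minimal sketch with illustrative values:

    resources:
      requests:
        cpu: 100m
        memory: 128Mi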

serathius (Contributor) commented:

Hey, can you verify that Metrics Server is working by running "kubectl top pod" and looking at the metrics-server logs?
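
For example, with the namespace and label from the manifests earlier in this thread:

    kubectl -n gaoce top pod -l app=tea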

gaocegege (Author) commented:

@serathius You can see the log here #270 (comment)

serathius (Contributor) commented:

Sorry for not noticing it. I don't see any problems in the metrics-server logs. Could you verify that the Metrics API returns correct results for the pods under HPA?

kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/gaoce/pods?labelSelector=app%3Dtea

If the metrics are correct, the problem is not with Metrics Server but with the HPA. Please create an issue in https://github.com/kubernetes/kubernetes
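
For reference, a healthy response contains per-container usage for each pod, roughly of this shape (all values illustrative):

    {
      "kind": "PodMetricsList",
      "apiVersion": "metrics.k8s.io/v1beta1",
      "items": [
        {
          "metadata": {"name": "tea-69bcd4566b-mbxgv", "namespace": "gaoce"},
          "timestamp": "2019-05-21T06:20:00Z",
          "window": "30s",
          "containers": [
            {"name": "tea", "usage": {"cpu": "1m", "memory": "2Mi"}}
          ]
        }
      ]
    }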

gaocegege (Author) commented:

Sure, but it cannot always be reproduced. I will take another look when I run into it again.

@serathius serathius added the kind/bug Categorizes issue or PR as related to a bug. label Dec 12, 2019