karmada controller can't find metrics workqueue_depth #5696

Closed
CharlesQQ opened this issue Oct 15, 2024 · 6 comments · Fixed by #5972

@CharlesQQ
Member

CharlesQQ commented Oct 15, 2024

Please provide an in-depth description of the question you have:

When I curl the karmada metrics endpoint, e.g. curl http://127.0.0.1:10358/metrics, the workqueue_depth metric is missing from the output:

controller_runtime_reconcile_total{controller="cluster",result="requeue"} 0
controller_runtime_reconcile_total{controller="cluster",result="requeue_after"} 20665
controller_runtime_reconcile_total{controller="clusterResourceBinding_status_controller",result="requeue"} 0
controller_runtime_reconcile_total{controller="clusterResourceBinding_status_controller",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="clusterresourcebinding",result="requeue"} 0
controller_runtime_reconcile_total{controller="clusterresourcebinding",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="cronfederatedhpa",result="requeue"} 0
controller_runtime_reconcile_total{controller="cronfederatedhpa",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="federatedhpa",result="requeue"} 0
controller_runtime_reconcile_total{controller="federatedhpa",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="federatedresourcequota",result="requeue"} 0
controller_runtime_reconcile_total{controller="federatedresourcequota",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="namespace",result="requeue"} 0
controller_runtime_reconcile_total{controller="namespace",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="remedy-controller",result="requeue"} 0
controller_runtime_reconcile_total{controller="remedy-controller",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="resourceBinding_status_controller",result="requeue"} 0
controller_runtime_reconcile_total{controller="resourceBinding_status_controller",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="resourcebinding",result="requeue"} 0
controller_runtime_reconcile_total{controller="resourcebinding",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="serviceimport",result="requeue"} 0
controller_runtime_reconcile_total{controller="serviceimport",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="work",result="requeue"} 0
controller_runtime_reconcile_total{controller="work",result="requeue_after"} 0
controller_runtime_reconcile_total{controller="workload-rebalancer",result="requeue"} 0
controller_runtime_reconcile_total{controller="workload-rebalancer",result="requeue_after"} 0

However, the workqueue_depth metric is present in another controller that we developed ourselves, which uses the same controller-runtime version as karmada.

curl http://127.0.0.1:8891/metrics | grep -i queue
controller_runtime_reconcile_total{controller="deployment",result="requeue"} 0
controller_runtime_reconcile_total{controller="deployment",result="requeue_after"} 12
# HELP workqueue_adds_total Total number of adds handled by workqueue
# TYPE workqueue_adds_total counter
workqueue_adds_total{name="deployment"} 998
# HELP workqueue_depth Current depth of workqueue
# TYPE workqueue_depth gauge
workqueue_depth{name="deployment"} 0
# HELP workqueue_longest_running_processor_seconds How many seconds has the longest running processor for workqueue been running.
# TYPE workqueue_longest_running_processor_seconds gauge
workqueue_longest_running_processor_seconds{name="deployment"} 0
# HELP workqueue_queue_duration_seconds How long in seconds an item stays in workqueue before being requested
# TYPE workqueue_queue_duration_seconds histogram
workqueue_queue_duration_seconds_bucket{name="deployment",le="1e-08"} 0

What do you think about this question?:
Why does the karmada controller not expose the workqueue_depth metric?

workqueue_depth is necessary to check whether there is congestion in the workqueue.
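
To make the congestion check concrete, here is a minimal sketch of how the scraped metrics text could be inspected for workqueue_depth. The helper name workqueueDepths is hypothetical and not part of karmada or controller-runtime; it assumes the plain Prometheus text format shown above, with no trailing timestamps on sample lines.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// workqueueDepths is a hypothetical helper: it scans Prometheus text-format
// output and returns the workqueue_depth gauge value keyed by queue name.
// Assumption: sample lines carry no timestamp, so the value is the last field.
func workqueueDepths(metricsText string) map[string]float64 {
	const prefix = `workqueue_depth{name="`
	depths := make(map[string]float64)
	scanner := bufio.NewScanner(strings.NewReader(metricsText))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip HELP/TYPE comments and all unrelated metrics.
		if !strings.HasPrefix(line, prefix) {
			continue
		}
		// Everything up to the next quote is the queue name.
		name, after, found := strings.Cut(strings.TrimPrefix(line, prefix), `"`)
		if !found {
			continue
		}
		fields := strings.Fields(after)
		if len(fields) == 0 {
			continue
		}
		value, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			continue
		}
		depths[name] = value
	}
	return depths
}

func main() {
	// Sample lines taken from the working controller's output above.
	sample := "# TYPE workqueue_depth gauge\n" +
		"workqueue_depth{name=\"deployment\"} 0\n" +
		"workqueue_adds_total{name=\"deployment\"} 998\n"

	depths := workqueueDepths(sample)
	if len(depths) == 0 {
		fmt.Println("workqueue_depth is not exposed")
		return
	}
	for name, depth := range depths {
		fmt.Printf("queue %q has depth %v\n", name, depth)
	}
}
```

An empty result map corresponds to the situation reported here: the endpoint answers, but no workqueue_depth series is present, so congestion cannot be observed.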

Environment:

  • Karmada version:
    v1.11
  • Kubernetes version:
    v1.23
  • Others:
    sigs.k8s.io/controller-runtime v0.18.4
@CharlesQQ CharlesQQ added the kind/question Indicates an issue that is a support question. label Oct 15, 2024
@XiShanYongYe-Chang
Member

Hi @chaosi-zju, do you know what causes this?

@chaosi-zju
Member

I looked into this issue and found that it is related to mixing two workqueue metrics providers: workqueue.prometheusMetricsProvider and metrics.workqueueMetricsProvider.

If I remove the following line, all metrics can be found:

workqueue.SetProvider(prometheusMetricsProvider{})

I tried this because karmada-scheduler uses only metrics.workqueueMetricsProvider, and its metrics are exposed correctly.
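
One plausible way that mixing two providers can hide metrics is sketched below. It assumes the workqueue metrics provider is a process-wide value that can be set only once, so whichever provider is installed first wins and later calls are silently ignored. That is my understanding of how workqueue.SetProvider behaves in client-go, but it is not confirmed in this thread; all types and names in the sketch are hypothetical stand-ins, not the real client-go or controller-runtime API.

```go
package main

import (
	"fmt"
	"sync"
)

// MetricsProvider is a simplified, hypothetical stand-in for a workqueue
// metrics provider. Registry reports where its metrics would be registered.
type MetricsProvider interface {
	Registry() string
}

// staticProvider is a trivial provider bound to one named registry.
type staticProvider struct{ registry string }

func (p staticProvider) Registry() string { return p.registry }

// providerFactory mimics a set-once global: only the first SetProvider call
// takes effect (assumed behaviour, modelled here with sync.Once).
type providerFactory struct {
	once     sync.Once
	provider MetricsProvider
}

// SetProvider installs p only if no provider has been installed yet.
func (f *providerFactory) SetProvider(p MetricsProvider) {
	f.once.Do(func() { f.provider = p })
}

// ActiveRegistry returns the registry of the provider that won, or "none".
func (f *providerFactory) ActiveRegistry() string {
	if f.provider == nil {
		return "none"
	}
	return f.provider.Registry()
}

func main() {
	f := &providerFactory{}
	// The first provider registers into a registry the endpoint does not serve.
	f.SetProvider(staticProvider{registry: "registry not served on /metrics"})
	// The second call is ignored, so its registry never receives queue metrics.
	f.SetProvider(staticProvider{registry: "registry served on /metrics"})
	fmt.Println("workqueue metrics go to:", f.ActiveRegistry())
}
```

Under that assumption, removing the extra SetProvider call leaves a single provider, which would match the observation that karmada-scheduler, with only one provider, exposes its workqueue metrics.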

@chaosi-zju
Member

I have no further ideas for now. If this problem is blocking you, could you try again with that line removed?

@CharlesQQ
Member Author

@chaosi-zju
Member

I summarized the root cause and alternative solution as follows:

#5945 (comment)

@RainbowMango
Member

/assign @XiShanYongYe-Chang
In favor of #5972
