Pulsar upgrade to 3.0.5 causes prometheus metrics timeouts on brokers #22897
Unanswered
justin-lathrop
asked this question in
Q&A
-
I deployed the 3.0.4 release instead, and the broker metrics started working again. I wonder if it's related to some of the metrics-related changes in this commit, which went into the 3.0.5 release.
-
@lhotari PTAL when you have the chance
-
After upgrading from Pulsar 2.9.4 to Pulsar 3.0.5 in a Kubernetes cluster using the Pulsar Helm 3.0 chart, Prometheus metrics stopped working on the pulsar-broker pods.
The pulsar-broker logs show a constant stream of 500 responses with timeouts, along with 302 redirects.
Running
pulsar-admin broker-stats monitoring-metrics
does return metrics, with values showing data moving through. But after exec'ing into the pulsar-broker-0 pod, `curl http://localhost:8080/metrics/` only times out. The pulsar-proxy instance was also upgraded as part of this, and it reports metrics as expected with no timeouts when curling the pulsar-proxy-0 metrics endpoint like so:
curl http://localhost:8080/metrics/
from within the pod. The error reported seems to come from this line in the code, but at present I do not see any of the other possible logs; it just seems to time out every time in the async process. https://github.com/apache/pulsar/blob/branch-3.0/pulsar-broker/src/main/java/org/apache/pulsar/broker/stats/prometheus/PulsarPrometheusMetricsServlet.java#L83
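For anyone trying to reproduce this, the curl check above can be made repeatable with an explicit client-side timeout, so that a hang, an HTTP error such as a 500, and a healthy response are easy to tell apart. This is only a rough sketch: it assumes Python 3 is available inside the pod, and `probe_metrics` and the 10-second default are hypothetical names and values chosen for illustration, not anything from Pulsar.

```python
import time
import urllib.error
import urllib.request


def probe_metrics(url, timeout_s=10.0):
    """Fetch url once and return (outcome, elapsed_seconds, body_bytes).

    Note: urllib follows 3xx redirects automatically, so a 302 is not
    reported on its own; only the final response is.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = resp.read()
            return (f"HTTP {resp.status}", time.monotonic() - start, len(body))
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 500).
        return (f"HTTP {err.code}", time.monotonic() - start, 0)
    except OSError as err:
        # Connection refused, DNS failure, or no response within timeout_s.
        return (f"failed: {err}", time.monotonic() - start, 0)


if __name__ == "__main__":
    # Same endpoint as the curl command above, run from inside the pod.
    print(probe_metrics("http://localhost:8080/metrics/"))
```

Running it against both pulsar-broker-0 and pulsar-proxy-0 with the same timeout would give a side-by-side comparison of the elapsed time and response size on each.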
Any help would be greatly appreciated!