Cannot see metric "kafka.consumer_lag" for consumers using "offsets.storage=kafka" #2611
Comments
Hi @sbrnunes! It's probably a configuration issue in …
Actually, I believe this is an issue with the check. Unless something changed recently, the agent only checks consumer offsets from ZK, not Kafka. We have the same issue. I've been meaning to fix the check to support Kafka-based offsets, but haven't gotten around to it yet.
Oh, that makes sense now @degemer. I was getting confused about this, as I couldn't see any code in the codebase pulling the offsets from Kafka.
Using Burrow to monitor offsets in combination with this plugin is working for me.
The easiest solution would be if upstream … See also dpkp/kafka-python#421 and dpkp/kafka-python#509.
Also having this problem.
I did a bunch of research into this as part of #2880. The concise summary is that it'll be much simpler to wait until KIP-88 lands before working on this.

The basic problem is that Datadog's consumer lag check is trying to grab all consumer offsets from a single place, whereas in the Java Kafka consumer (and most other Kafka consumer implementations) the consumer itself knows its offset and can report it somewhere as part of the …

As KIP-88 explains, there is a workaround that involves creating a dummy consumer, joining the consumer group, then calling …

Additionally, no matter how this is implemented, the upstream Python library will need to support the call. Either …
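The dummy-consumer workaround described above can be sketched with kafka-python. This is a rough illustration, not the actual Datadog check: it assumes a reachable broker, and the function name `kafka_stored_lag` plus the broker/group/topic values are invented placeholders.

```python
def partition_lag(end_offset, committed_offset):
    """Lag = broker log-end offset minus last committed offset, floored at 0."""
    if committed_offset is None:  # the group has never committed this partition
        return None
    return max(0, end_offset - committed_offset)


def kafka_stored_lag(bootstrap, group, topic, partition=0):
    # Dummy consumer joined to the target group; enable_auto_commit=False
    # keeps the probe read-only so it never moves the group's offsets.
    from kafka import KafkaConsumer, TopicPartition  # assumes kafka-python
    consumer = KafkaConsumer(bootstrap_servers=bootstrap,
                             group_id=group,
                             enable_auto_commit=False)
    tp = TopicPartition(topic, partition)
    committed = consumer.committed(tp)    # offset stored in Kafka, or None
    end = consumer.end_offsets([tp])[tp]  # broker highwater mark
    consumer.close()
    return partition_lag(end, committed)


# e.g. kafka_stored_lag("localhost:9092", "my-group", "my-topic")
```

This is exactly why the check can't easily scrape all groups at once: the probe has to name a group and topic-partition up front, which is the limitation KIP-88 removes.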
I just submitted a PR adding support for this: … Please try it out and let me know if you hit issues. We run it at my day job against six production Kafka clusters. Note that the source is still littered with TODOs, as I still need to flesh out some of the error handling and support for time-based offsets.
Related issue in …
Bumping for attention |
2019 still not fixed |
This still doesn't work? We have consumer groups reporting lag etc. via the Kafka CLI, but the DD agent doesn't seem to pick anything up.
This ticket should be closed. cc @ofek. The fix in #423 / #654 was merged two years ago. Additionally, I added support in DataDog/integrations-core#3957 for monitoring unlisted consumer groups. So you'll want to make sure your version of this check is upgraded to the latest release, then read the updated config file comments to make sure the options are set properly. If you're still having issues, it's probably best to open a new ticket rather than reuse this one.
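For reference, a hedged sketch of what the updated check config might look like; the option names below should be verified against the `conf.yaml.example` shipped with your check version, since they have changed across releases:

```yaml
init_config:

instances:
    ## Placeholder broker address
  - kafka_connect_str: localhost:9092
    ## Read offsets committed to Kafka rather than ZooKeeper
    ## (option name may differ in your check version)
    kafka_consumer_offsets: true
    ## Per DataDog/integrations-core#3957: report lag for all consumer
    ## groups, not only those explicitly listed under consumer_groups
    monitor_unlisted_consumer_groups: true
```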
Yes, anyone who is still experiencing issues should contact support. Thanks!
Hi,
We're trying to set up the integration between Datadog and Kafka and report metrics for a few consumers that commit their offsets into Kafka (we use "offsets.storage=kafka").
We're able to see metrics for consumers using "offsets.storage=zookeeper", but not for the ones committing their offsets into Kafka.
We're particularly interested in knowing the consumer lag, which, as far as I know, is reported as "kafka.consumer_lag".
In the logs we can only see the following warning:
2016-06-20 17:03:48 UTC | WARNING | dd.collector | checks.kafka_consumer(kafka_consumer.py:58) | No zookeeper node at /kafka/consumers/<consumer_group>/offsets/<topic>/<partition>
Any idea what we might be doing wrong?