Metricbeat Kafka Module Enhancements #3005
Comments
One of the most important metrics to monitor in Kafka is lag. It looks like it could be calculated as "kafka.partition.offset.newest - kafka.consumergroup.offset", but those values are in two different metricsets with no identical fields to search by. Getting these two values into one metricset, or renaming the topic name field to the same thing in both documents, may work.
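To illustrate the subtraction described above, here is a minimal Python sketch that joins events from the two metricsets outside of Metricbeat. Only "kafka.partition.offset.newest" and "kafka.consumergroup.offset" come from this thread; the join keys "topic", "partition" and "group" are hypothetical names standing in for whatever the real documents call those fields, and events are assumed to be flat dicts.

```python
from typing import Dict, Iterable, List, Tuple


def consumer_lag(
    partition_events: Iterable[dict],
    consumergroup_events: Iterable[dict],
) -> List[dict]:
    """Join partition and consumer group events and compute lag per partition.

    The keys "topic", "partition" and "group" are hypothetical; only the two
    offset field names are taken from the discussion.
    """
    # Latest known end offset per (topic, partition); later events overwrite earlier ones.
    newest: Dict[Tuple[str, int], int] = {}
    for ev in partition_events:
        newest[(ev["topic"], ev["partition"])] = ev["kafka.partition.offset.newest"]

    lags: List[dict] = []
    for ev in consumergroup_events:
        key = (ev["topic"], ev["partition"])
        if key not in newest:
            # No matching partition event, so lag cannot be computed.
            continue
        lag = newest[key] - ev["kafka.consumergroup.offset"]
        lags.append(
            {
                "group": ev["group"],
                "topic": ev["topic"],
                "partition": ev["partition"],
                # The two events are sampled at different times, so clamp at zero.
                "lag": max(lag, 0),
            }
        )
    return lags


if __name__ == "__main__":
    partitions = [
        {"topic": "logs", "partition": 0, "kafka.partition.offset.newest": 1500},
    ]
    groups = [
        {"group": "indexer", "topic": "logs", "partition": 0,
         "kafka.consumergroup.offset": 1200},
    ]
    print(consumer_lag(partitions, groups))
```

Clamping at zero is a design choice in this sketch: because the two events are collected separately, a committed offset can briefly appear ahead of the last observed newest offset.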
About lag: consumer groups are potentially handled by different brokers. Metricbeat might need to connect to a remote Kafka node in order to collect and correlate the additional information.
As of now, Metricbeat must be installed on every Kafka node and should collect stats from the local node only. => Add a 'cluster' mode to the Kafka module: one configures a few hosts for bootstrapping only, and the module then collects cluster-wide stats. This would allow Metricbeat to collect stats from a remote cluster without having to list every single node in the cluster.
Agree that the main motivation for monitoring Kafka is to track consumer group lag. Ideally the lag should be a field in an event, or we should be able to calculate it by subtracting two fields in the same event. The information is actually present and correct, but as it's in two different events, I think there's no way to draw graphs of it in Kibana? And none of my threshold detection and alerting code has any concept of doing a subtraction across two different events. (My particular problem is that someone can send a storm of spam into Kafka, and I'd rather like to spot that a consumer group has fallen a couple of days behind so that I can do something about it.)
@gingerwizard prepared a dashboard for demo.elastic.co; it can be found here, and it includes the consumer group lag. To help correlate the data in different events, we have renamed and added some fields. We expect to include these changes and the dashboard in Metricbeat 6.5.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Community contributions: