Metricbeat Kafka Module Enhancements #3005

Closed
3 of 5 tasks
ruflin opened this issue Nov 15, 2016 · 7 comments

@ruflin
Member

ruflin commented Nov 15, 2016

Community contributions:

@ruflin ruflin mentioned this issue Nov 15, 2016
@tbragin tbragin changed the title from "Kafka Module Enhancements" to "Metricbeat Kafka Module Enhancements" Jan 12, 2018
@ruflin ruflin added the module label Feb 26, 2018
@rpn0709

rpn0709 commented Mar 21, 2018

One of the most important metrics to monitor in Kafka is lag. It looks like it could be calculated as "kafka.partition.offset.newest - kafka.consumergroup.offset", but those values are in two different metricsets with no identical fields to search by. If we could get these two values into one metricset, or give the topic name field the same name in both documents, that might work.
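
As a rough illustration of the subtraction described above, here is a minimal Python sketch that joins events from the two metricsets on topic and partition outside of Elasticsearch. Only `kafka.partition.offset.newest` and `kafka.consumergroup.offset` come from this thread; the topic, partition, and group field names, and the `compute_lag` helper itself, are assumptions made for the example and may not match the documents Metricbeat actually emits.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Field names taken from the comment above.
NEWEST_OFFSET = "kafka.partition.offset.newest"
GROUP_OFFSET = "kafka.consumergroup.offset"

# Hypothetical field names used as join keys; adjust to the real documents.
PARTITION_TOPIC = "kafka.partition.topic.name"
PARTITION_ID = "kafka.partition.partition.id"
GROUP_ID = "kafka.consumergroup.id"
GROUP_TOPIC = "kafka.consumergroup.topic"
GROUP_PARTITION = "kafka.consumergroup.partition"


def compute_lag(
    partition_events: Iterable[Dict[str, Any]],
    consumergroup_events: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Join the two event streams on (topic, partition) and compute lag."""
    # Keep the largest "newest" offset seen for each (topic, partition).
    newest: Dict[Tuple[str, int], int] = {}
    for ev in partition_events:
        key = (ev[PARTITION_TOPIC], ev[PARTITION_ID])
        newest[key] = max(newest.get(key, 0), ev[NEWEST_OFFSET])

    lags: List[Dict[str, Any]] = []
    for ev in consumergroup_events:
        key = (ev[GROUP_TOPIC], ev[GROUP_PARTITION])
        if key not in newest:
            continue  # no partition event to correlate with
        lags.append({
            "group": ev[GROUP_ID],
            "topic": key[0],
            "partition": key[1],
            # The two samples are taken at different times, so clamp at zero.
            "lag": max(newest[key] - ev[GROUP_OFFSET], 0),
        })
    return lags
```

The join key is the piece that is missing when the two metricsets name the topic field differently, which is why the sketch maps both onto a shared `(topic, partition)` tuple before subtracting.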

@urso urso mentioned this issue May 8, 2018
11 tasks
@jsoriano jsoriano self-assigned this May 8, 2018
@urso

urso commented May 9, 2018

About lag: consumer groups are potentially handled by different brokers. Metricbeat might need to connect to a remote Kafka node in order to collect and correlate additional information.

@urso

urso commented May 9, 2018

As of now, Metricbeat must be installed on every Kafka node and should collect stats from the local node only. => Add a 'cluster' mode to the Kafka module. One configures a few hosts for bootstrapping only, and the module then collects cluster-wide stats. This will allow Metricbeat to collect stats from a remote cluster without having to list every single node in the cluster.

@TimWardOrigami

Agree that the main motivation for monitoring Kafka is to track consumer group lag. Ideally the lag should be a field in an event, or we should be able to calculate it by subtracting two fields in the same event.

The information is actually present and correct, but because it's in two different events I don't think there's a way to graph it in Kibana, and none of my threshold detection and alerting code has any concept of doing a subtraction across two different events.

(My particular problem is that someone can send a storm of spam into Kafka, and I'd rather like to spot that a consumer group has fallen a couple of days behind so that I can do something about it.)

@jsoriano
Member

@gingerwizard prepared a dashboard for demo.elastic.co; it can be found here, and it includes the consumer group lag. To help correlate the data in different events, we have renamed and added some fields. We expect to include these changes and the dashboard in Metricbeat 6.5.

@botelastic

botelastic bot commented Jan 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the Stalled label Jan 5, 2021
@jsoriano
Member

jsoriano commented Jan 5, 2021

Closing: there are already metrics for partition offsets and consumer lag. There are other open issues about improving support for consumers, such as #22859 and #23077.

@jsoriano jsoriano closed this as completed Jan 5, 2021
@zube zube bot removed the [zube]: Done label Apr 5, 2021
6 participants