
Feature request - Prometheus metrics support #318

Open
varun06 opened this issue Jan 2, 2018 · 11 comments

varun06 commented Jan 2, 2018

Now that #149 is closed, can we decide on the right approach and add Prometheus metrics support to the project?

@dmitryilyin

There is an ongoing effort to build a telegraf input plugin for the Burrow API here: influxdata/telegraf#3489

It can then be used to send metrics to Graphite, InfluxDB, Prometheus and others.


varun06 commented Jan 17, 2018

@dmitryilyin does that mean "use both telegraf and burrow", or can that effort be used to add Prometheus support to burrow itself?


solsson commented Jan 17, 2018

@dmitryilyin What advantages do you see in exporting via Telegraf?


dmitryilyin commented Jan 18, 2018

Yes, it means using them both. Adding Prometheus-format metrics to burrow is indeed useful, but other people will (and already do) want Graphite output, others are writing an InfluxDB connector, and there are many more monitoring systems around.

On the other hand, telegraf works as a Swiss Army knife. It has a lot of input plugins (https://github.com/influxdata/telegraf/tree/master/plugins/inputs) and can easily be extended with exec reporter scripts, so it can gather and receive metrics from a lot of things, including gathering system metrics much better than Prometheus' node_exporter, which you are probably using anyway.
It can output metrics to a lot of things too, including Prometheus and Graphite (https://github.com/influxdata/telegraf/tree/master/plugins/outputs), although different metric formats and styles can complicate things.

The Prometheus style is to have a lot of different exporters and/or integrate metrics gathering into applications, plus a push gateway for scripts. Which approach is better? Who knows.

If you have only Prometheus and are not going to integrate with anything else, then perhaps you don't need telegraf at all and can use burrow_exporter or integrate metrics into burrow itself; if you do need to talk to many other systems, telegraf may be the better choice.

Anyway, adding Prometheus metrics directly to burrow would be helpful. It would also let telegraf use its Prometheus protocol support against Burrow instead of going through burrow's API. Whether that would be better remains to be seen.


varun06 commented Jan 18, 2018

That makes sense, but yeah, adding Prometheus support to burrow is going to be helpful too.


solsson commented Jan 18, 2018

I think it should be noted that exporting to Prometheus doesn't come with the usual complexities of maintaining an integration. It's an HTTP endpoint, nothing else. Very much like the GET endpoints in the /v3 API, but with plaintext instead of JSON.

It'd be great if the discussion for how to map the current responses to Prometheus labels took place in this repo. It affects how useful the exported metrics are for consumer lag monitoring.
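
To make that concrete, here is a minimal sketch of such an endpoint in plain Go, with no client library at all; the metric name, label values, and hard-coded number are invented for the example, and the port is arbitrary:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// A /metrics endpoint is just another GET handler; the body is the
	// Prometheus text exposition format instead of JSON.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprintln(w, "# HELP example_partition_offset A hand-written sample metric.")
		fmt.Fprintln(w, "# TYPE example_partition_offset gauge")
		fmt.Fprintln(w, `example_partition_offset{cluster="local",topic="__consumer_offsets",partition="12"} 2428`)
	})
	http.ListenAndServe(":8080", nil)
}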

If you have only Prometheus and are not going to integrate with anything else, then perhaps you don't need telegraf at all and can use burrow_exporter or integrate metrics into burrow itself; if you do need to talk to many other systems, telegraf may be the better choice.

Using burrow_exporter is ok, though it adds a delay (unless its polling is perfectly synced with Prometheus pull) and some overhead. It too needs a discussion on mapping to labels. Is anyone interested in helping out with jirwin/burrow_exporter#9, i.e. support for the current API version?


solsson commented Jan 19, 2018

This is a sample metric I get out of burrow_exporter after my v3 search-and-replace:

# HELP kafka_burrow_topic_partition_offset The latest offset on a topic's partition as reported by burrow.
# TYPE kafka_burrow_topic_partition_offset gauge
kafka_burrow_topic_partition_offset{cluster="local",partition="12",topic="__consumer_offsets"} 2428

I think these labels make sense.
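
As a rough sketch of how those labels could be wired up natively with the prometheus/client_golang library (the recordOffset helper and the values passed to it are assumptions for illustration, not actual Burrow code):

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// partitionOffset mirrors the sample above: one gauge per
// cluster/topic/partition combination.
var partitionOffset = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "kafka_burrow_topic_partition_offset",
		Help: "The latest offset on a topic's partition as reported by burrow.",
	},
	[]string{"cluster", "topic", "partition"},
)

// recordOffset would be called wherever the application updates its view of broker offsets.
func recordOffset(cluster, topic, partition string, offset float64) {
	partitionOffset.WithLabelValues(cluster, topic, partition).Set(offset)
}

func main() {
	prometheus.MustRegister(partitionOffset)
	recordOffset("local", "__consumer_offsets", "12", 2428)

	// Serve the registered metrics in the text exposition format.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}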

I had a quick look at the source to try to get the lag export working, but rather than spending time on the structs there... could anyone hint at how to get hold of these data structures (https://github.com/linkedin/Burrow/wiki/Templates#data-in-templates) inside Burrow instead, whenever they change?


solsson commented Jan 19, 2018

An argument for an external exporter might be that it can do actual integrations without adding complexity to Burrow. For example, it could look up owner IPs from partition info in the Kubernetes API, to tag metrics with an optional owner_pod_name.

I think the exporter is OK with v3 as of jirwin/burrow_exporter#9 (comment); see the sample export there. I think the labels are good, and they'll be forward compatible even if more labels are added later.


Xaelias commented Mar 14, 2019

One of the big drawbacks of an external integration like the burrow exporter linked here is that it has its own scrape interval, on top of the Prometheus scrape interval.
As mentioned above, Prometheus metrics are just a plaintext representation of what burrow already has, so having that inside burrow shouldn't add a whole lot of complexity. I would also rather not have to rely on 2/3/... projects just to track Kafka lag :-D


Xaelias commented Mar 22, 2019

Oh, also, the burrow exporter is actually buggy. It looks like the maintainer is not responsive (although hopefully they will respond later), and I just don't have the Go expertise to fix the net/http code myself, so...


shamil commented May 30, 2019

One of the big drawbacks of an external integration like the burrow exporter linked here is that it has its own scrape interval, on top of the Prometheus scrape interval.

This is fixed in my fork, which is mostly a full refactor (except the burrow client). I'm now using a custom collector implementation, which means the scrape happens on demand when the /metrics endpoint is scraped by Prometheus: https://github.com/shamil/burrow_exporter
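
For anyone curious what "scrape happens on demand" means in practice: with a custom prometheus.Collector, the registry calls Collect every time Prometheus hits /metrics, so there is no second polling interval. A hedged sketch of the pattern (fetchOffsets and the metric name are placeholders, not the fork's actual code):

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// offsetSample stands in for whatever the exporter reads from Burrow's HTTP API.
type offsetSample struct {
	cluster, topic, partition string
	offset                    float64
}

// fetchOffsets is a hypothetical stand-in for querying Burrow's /v3 API.
func fetchOffsets() []offsetSample {
	return []offsetSample{{"local", "__consumer_offsets", "12", 2428}}
}

type burrowCollector struct {
	offsetDesc *prometheus.Desc
}

func newBurrowCollector() *burrowCollector {
	return &burrowCollector{
		offsetDesc: prometheus.NewDesc(
			"kafka_burrow_topic_partition_offset",
			"The latest offset on a topic's partition as reported by burrow.",
			[]string{"cluster", "topic", "partition"}, nil,
		),
	}
}

func (c *burrowCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.offsetDesc
}

// Collect runs on every Prometheus scrape, so data is fetched on demand
// rather than on a separate polling interval.
func (c *burrowCollector) Collect(ch chan<- prometheus.Metric) {
	for _, s := range fetchOffsets() {
		ch <- prometheus.MustNewConstMetric(
			c.offsetDesc, prometheus.GaugeValue, s.offset,
			s.cluster, s.topic, s.partition,
		)
	}
}

func main() {
	prometheus.MustRegister(newBurrowCollector())
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}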
