Feature request - Prometheus metrics support #318
Comments
There is an ongoing effort to build a telegraf input plugin for the Burrow API here: influxdata/telegraf#3489. It can then be used to send metrics to Graphite, InfluxDB, Prometheus and others. |
@dmitryilyin does that mean "use telegraf and burrow both", or that the effort can be used to add prometheus support to burrow itself? |
@dmitryilyin What advantages do you see in exporting via Telegraf? |
Yes, it means using them both. Adding Prometheus format metrics to burrow is indeed useful, but other people will (and already do) want Graphite output, others are writing an InfluxDB connector, and there are many more monitoring systems around. On the other hand, telegraf works as a Swiss army knife. It has a lot of input plugins https://github.com/influxdata/telegraf/tree/master/plugins/inputs and can be easily extended by exec reporter scripts, so it can gather and receive metrics from a lot of things, including gathering system metrics much better than Prometheus' node_exporter, which you should be using, right. The Prometheus style is to have a lot of different exporters and/or to integrate metrics gathering into applications, with a push gateway for scripts. Which approach is better? Who knows. If you have only Prometheus and are not going to integrate with anything else, then perhaps you don't need telegraf at all and can use burrow_exporter or integrate metrics into burrow itself. Or maybe you can try telegraf instead if you do need to talk to many other systems. Anyway, adding Prometheus metrics directly to burrow will be helpful. It will also allow using telegraf's Prometheus protocol support against Burrow instead of burrow's API. Whether that will be better remains to be seen. |
That makes sense, but yeah, adding prom support to burrow is going to be helpful too. |
I think it should be noted that exporting to Prometheus doesn't come with the usual complexities of maintaining an integration. It's an HTTP endpoint, nothing else. Very much like the GET endpoints in the /v3 API, but with plaintext instead of JSON. It'd be great if the discussion for how to map the current responses to Prometheus labels took place in this repo. It affects how useful the exported metrics are for consumer lag monitoring.
Using burrow_exporter is ok, though it adds a delay (unless its polling is perfectly synced with Prometheus pull) and some overhead. It too needs a discussion on mapping to labels. Is anyone interested in helping out with jirwin/burrow_exporter#9, i.e. support for the current API version? |
This is a sample metric I get out of burrow_exporter after my v3 search-and-replace:
I think these labels make sense. I had a quick look at the source to try to get the lag export working, but instead of spending time on the structs there... Could anyone hint on how to get hold of these data structures https://github.com/linkedin/Burrow/wiki/Templates#data-in-templates inside Burrow instead, whenever they change? |
An argument for an external exporter might be that it can do actual integrations without adding to Burrow complexity. For example it could look up I think the exporter is ok with v3 since jirwin/burrow_exporter#9 (comment). See sample export there. I think the labels are good, and they'll be forward compatible even if more labels are added later. |
One of the big drawbacks of an external integration like the burrow exporter linked here is that it has its own scrape interval, on top of the Prometheus scrape interval. |
Oh, also the burrow exporter is actually bugged. It looks like the maintainer is not responsive (although they might respond later, hopefully). And I just don't have the Go expertise to fix the net/http code myself, so... |
This is fixed in my fork, which is a mostly full refactor (except the burrow client). I'm now using a custom collector implementation, which means the scrape happens on demand when
Now that #149 is closed, can we decide on the right approach and add Prometheus metrics support to the project?