[Discussion required] Prometheus metrics export #1415
Conversation
Looks like I may have misunderstood how you work with
Definitely interested in this. Will have time to look over it later on today.
@Nalum Did you have a chance to take a look?
@@ -39,6 +41,9 @@ func Handler(core *vault.Core) http.Handler {
	mux.Handle("/v1/sys/", handleLogical(core, true, nil))
	mux.Handle("/v1/", handleLogical(core, false, nil))

	// This is probably wrong in quite a few ways - unsure how to register it properly? Something like Mounting, maybe?
I would look at keeping this in line with the other telemetry options, i.e. it is in the configuration file and not something that the user mounts via Vault commands. This looks good to me but I'd prefer to get feedback from the devs on it.
Yeah, there is a configuration file option which I use in command/server.go. However, the configuration is not available in this scope. So, something else needs to be done, I'm just not sure what. I was thinking that maybe I could reuse the mount mechanics, not requiring that the user does it manually.
Ah, okay. I definitely think some input from HashiCorp on this would be good to get. I agree auto-mounting would be the way to go with it, but I wouldn't be too sure about how to implement it, as I've not looked at the code in any depth.
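(For reference, here is a minimal sketch of how go-metrics' Prometheus support could be wired up behind a telemetry config option. This is not the PR's actual code; the `PrometheusEnabled` field, the `setupTelemetry` helper, and the placement are all assumptions made purely for illustration.)

```go
// Sketch only: config-gated registration of go-metrics' Prometheus sink.
// The TelemetryConfig type and PrometheusEnabled field are hypothetical.
package main

import (
	"log"
	"net/http"

	metrics "github.com/armon/go-metrics"
	metricsprom "github.com/armon/go-metrics/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

type TelemetryConfig struct {
	PrometheusEnabled bool // hypothetical config-file option
}

func setupTelemetry(cfg TelemetryConfig, mux *http.ServeMux) error {
	if !cfg.PrometheusEnabled {
		return nil
	}
	// The sink registers itself as a collector with the default registry.
	sink, err := metricsprom.NewPrometheusSink()
	if err != nil {
		return err
	}
	if _, err := metrics.NewGlobal(metrics.DefaultConfig("vault"), sink); err != nil {
		return err
	}
	// Expose everything in the default registry on /metrics.
	mux.Handle("/metrics", promhttp.Handler())
	return nil
}

func main() {
	mux := http.NewServeMux()
	if err := setupTelemetry(TelemetryConfig{PrometheusEnabled: true}, mux); err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe(":8200", mux)) // address is illustrative
}
```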
Not sure about go-metrics, but from plain Prometheus this is "the way": just squat on /metrics, unconditionally. Because of the pull model there is nothing to configure, and if it's not getting queried you're not paying any price for it.
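(That convention, as a minimal sketch using the official Go client library; the address and the use of the default registry are illustrative assumptions, not anything from this PR.)

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Register the handler once; with the pull model there is nothing else
	// to configure, and an endpoint nobody scrapes costs essentially nothing.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9090", nil)) // address is illustrative
}
```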
In some Prometheus binaries we allow configuring the metrics endpoint via a flag, but that's about it. Example: https://github.com/prometheus/node_exporter/blob/master/node_exporter.go#L124
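(Roughly that pattern, sketched out; the flag name mirrors node_exporter's `-web.telemetry-path`, but the rest of this skeleton is assumed and is not node_exporter's actual code.)

```go
package main

import (
	"flag"
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Only the exposed path is configurable; everything else stays fixed.
var metricsPath = flag.String("web.telemetry-path", "/metrics",
	"Path under which to expose metrics.")

func main() {
	flag.Parse()
	http.Handle(*metricsPath, promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9100", nil)) // node_exporter's default port
}
```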
Sorry for the radio silence...we're slammed working towards the next release. I don't have experience with Prometheus, but I work with the guy who wrote the go-metrics library, so my plan is to find out any integration preferences from him and get this slotted in for the release after next.
!summon @juliusv :-)
I think my preference would be to have a separate listener on a separate port, but I don't know if this breaks the normal model of Prometheus. The
From a Prometheus perspective, a separate port is totally fine, if needed.
I think the nice thing there is that there is no potential contention within the Vault API, and no interaction with Vault's core by the Prometheus code.
@jefferai Good idea! I'm away for a few days, but I'll change the PR on Monday.
Took a bit longer than expected, but I took another stab at it now. I've moved the metrics endpoint to its own HTTP server on port 8201. I've a few question marks regarding configuration with this approach, however - should it attempt to mimic the settings of the (real) Vault server with regards to IP/TLS settings etc.? Right now I just use plain HTTP on
Would it be possible/advisable to piggyback on the
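(For concreteness, a rough sketch of what a dedicated plain-HTTP metrics listener on port 8201 could look like. The port and the lack of TLS come from the comment above; the `startMetricsListener` helper and everything else are assumptions, not the PR's code.)

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// startMetricsListener serves only /metrics on its own listener, so the
// Prometheus handler never touches the main Vault API mux.
func startMetricsListener(addr string) {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler())
	go func() {
		if err := http.ListenAndServe(addr, mux); err != nil {
			log.Printf("metrics listener stopped: %v", err)
		}
	}()
}

func main() {
	startMetricsListener("0.0.0.0:8201") // plain HTTP, no TLS, per the comment
	select {}                            // stand-in for the main server loop
}
```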
Moving metrics to a separate port is not a good idea. It adds unnecessary operational complexity.
@SuperQ I'm not entirely convinced either, but since @jefferai and Julius seemed to believe it was a better match, I gave it a try at least. I can see Jeff's point about isolating it from Vault's core. Even so, I would also prefer a solution which did not use another port, but I'll need some guidance on where to attach in that case. Just dropping it in the main server initialization as I did in the first version was a bit ugly.
We had our employee summit last week and had a conversation about this across the various HC projects (and other monitoring solutions that require you to build in a listener). Unfortunately, we've decided that we do not want to support this mode of operation. Adding additional handler code unnecessarily opens up further attack surface and creates an ongoing maintainability burden, since if the expected API of the upstream monitoring service changes, it requires updating within Vault. We think the overall better approach would be to simply create a tiny handler that ingresses the data from Vault (e.g. via statsite or statsd) and exposes a Prometheus endpoint for it. That decouples the API expectations from Vault itself and does not require additional network-facing code. Sorry about the long delay on this...it just took a while to get all of the right people together to discuss.
For anyone watching, I'm using this successfully: https://github.com/prometheus/statsd_exporter
@roboll That seems like a perfect answer :-D
Would /v1/sys/metrics be so bad? It could even require a token if the data was considered sensitive. The statsd_exporter isn't terrible, but it's far from ideal either (you can't really auth a UDP statsd endpoint at this time). It's also something else people have to run and manage. You do lose a few other things this way.
@tcolgate An authenticated
The Prometheus metrics format is trivial to implement yourselves if it is
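(To illustrate how simple the text exposition format is to emit by hand, here is a small sketch; the metric name, label, and value are made up purely for illustration and are not real Vault metrics.)

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// metricsHandler hand-writes the Prometheus text exposition format.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprintln(w, "# TYPE vault_example_requests_total counter")
	fmt.Fprintf(w, "vault_example_requests_total{path=%q} %d\n", "sys/health", 42)
}

func main() {
	http.HandleFunc("/metrics", metricsHandler)
	log.Fatal(http.ListenAndServe(":8080", nil)) // address is illustrative
}
```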
@tcolgate That is a good idea! I suppose this would need to go into the
I'm using statsd for Consul metrics and it was a PITA to set up, and it provides less accurate metrics (both in terms of timestamps and due to pre-summing on the statsd side of things). I understand if people are reluctant to depend on a 3rd-party library, but IMO there should be at least some way to pull the current metric state in whatever format. That is the only way to get "correct" metrics in a pull-based monitoring system.
@roboll do you happen to have the statsd_exporter config that you used to fix up the metrics? I'm currently using it without a configuration, so I get a ton of mostly hard-to-use metrics. Outside of that, I would love it if we had an HTTP endpoint to pull metrics from, similar to what Elasticsearch does. That way it would be easy to write an exporter that pulls from there.
You can also use https://github.com/seatgeek/statsd-rewrite-proxy to convert Consul telemetry into tagged metrics for Datadog or similar tagged-data backends.
@jefferai Have you re-evaluated this already? You wouldn't be the first. The Prometheus metrics format has become the de facto standard for metrics in the 'cloud native' world. It's not only intended for consumption by Prometheus: Datadog, for example, uses the same format to gather metrics from things like Kubernetes components.
Related issue: #1230
This is a first draft of how to expose metrics to Prometheus, using the support built into go-metrics.
The main pain point is likely in how the HTTP endpoint is registered - I'm not sure how best to go about it. It should only be registered if the config option is enabled, for a start, and probably not in the main handler package. Can/should the mount mechanism be used?
Also, I would prefer it if it were possible to set DisableHostname when exporting to Prometheus (the hostname will be collected as part of the scrape anyway). However, if some users want to have multiple telemetry options enabled, it shouldn't affect all of them, I think. I don't see an easy way to change it to a per-telemetry-sink setting, though. Maybe creating some per-sink option map? Might be a bit of over-doing it?
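(To illustrate why this is awkward to do per sink: in go-metrics the hostname behaviour is part of the shared Config, so flipping it applies to every sink in a fanout, not only the Prometheus one. A small sketch, assuming an in-memory sink as a stand-in for statsite/statsd; none of this is the PR's actual code.)

```go
package main

import (
	"log"
	"time"

	metrics "github.com/armon/go-metrics"
	metricsprom "github.com/armon/go-metrics/prometheus"
)

func main() {
	promSink, err := metricsprom.NewPrometheusSink()
	if err != nil {
		log.Fatal(err)
	}
	// Stand-in for a second sink such as statsite/statsd.
	inmemSink := metrics.NewInmemSink(10*time.Second, time.Minute)

	// EnableHostname lives on the shared Config, so turning it off here
	// affects every sink in the fanout, not just the Prometheus one.
	conf := metrics.DefaultConfig("vault")
	conf.EnableHostname = false

	fanout := metrics.FanoutSink{inmemSink, promSink}
	if _, err := metrics.NewGlobal(conf, fanout); err != nil {
		log.Fatal(err)
	}
}
```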