[Discussion required] Prometheus metrics export #1415
Conversation
Looks like I may have misunderstood how you work with
Definitely interested in this. Will have time to look over it later on today.
@Nalum Did you have a chance to take a look?
@@ -39,6 +41,9 @@ func Handler(core *vault.Core) http.Handler {
	mux.Handle("/v1/sys/", handleLogical(core, true, nil))
	mux.Handle("/v1/", handleLogical(core, false, nil))

	// This is probably wrong in quite a few ways - unsure how to register it properly? Something like Mounting, maybe?
I would look at keeping this in line with the other telemetry options, i.e. it is in the configuration file and not something that the user mounts via Vault commands. This looks good to me but I'd prefer to get feedback from the devs on it.
Yeah, there is a configuration file option which I use in command/server.go. However, the configuration is not available in this scope. So, something else needs to be done, I'm just not sure what. I was thinking that maybe I could reuse the mount mechanics, not requiring that the user does it manually.
Ah, okay. I definitely think some input from HashiCorp on this would be good to get. I agree auto-mounting would be the way to go with it, but I wouldn't be too sure about how to implement it, as I've not looked at the code in any depth.
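(For reference, here is a minimal sketch of how go-metrics' Prometheus support could be wired up behind a telemetry config option. This is not the PR's actual code; the `PrometheusEnabled` field, the `setupTelemetry` helper, and the placement are all assumptions made purely for illustration.)

```go
// Sketch only: config-gated registration of go-metrics' Prometheus sink.
// The TelemetryConfig type and PrometheusEnabled field are hypothetical.
package main

import (
	"log"
	"net/http"

	metrics "github.com/armon/go-metrics"
	metricsprom "github.com/armon/go-metrics/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

type TelemetryConfig struct {
	PrometheusEnabled bool // hypothetical config-file option
}

func setupTelemetry(cfg TelemetryConfig, mux *http.ServeMux) error {
	if !cfg.PrometheusEnabled {
		return nil
	}
	// The sink registers itself as a collector with the default registry.
	sink, err := metricsprom.NewPrometheusSink()
	if err != nil {
		return err
	}
	if _, err := metrics.NewGlobal(metrics.DefaultConfig("vault"), sink); err != nil {
		return err
	}
	// Expose everything in the default registry on /metrics.
	mux.Handle("/metrics", promhttp.Handler())
	return nil
}

func main() {
	mux := http.NewServeMux()
	if err := setupTelemetry(TelemetryConfig{PrometheusEnabled: true}, mux); err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe(":8200", mux)) // address is illustrative
}
```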
Not sure about go-metrics, but from plain Prometheus this is "the way": just squat on /metrics, unconditionally. Because of the pull model there is nothing to configure, and if it's not getting queried you're not paying any price for it.
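(That convention, as a minimal sketch using the official Go client library; the address and the use of the default registry are illustrative assumptions, not anything from this PR.)

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Register the handler once; with the pull model there is nothing else
	// to configure, and an endpoint nobody scrapes costs essentially nothing.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9090", nil)) // address is illustrative
}
```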
In some Prometheus binaries we allow configuring the metrics endpoint via a flag, but that's about it. Example: https://github.com/prometheus/node_exporter/blob/master/node_exporter.go#L124
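(Roughly that pattern, sketched out; the flag name mirrors node_exporter's `-web.telemetry-path`, but the rest of this skeleton is assumed and is not node_exporter's actual code.)

```go
package main

import (
	"flag"
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Only the exposed path is configurable; everything else stays fixed.
var metricsPath = flag.String("web.telemetry-path", "/metrics",
	"Path under which to expose metrics.")

func main() {
	flag.Parse()
	http.Handle(*metricsPath, promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9100", nil)) // node_exporter's default port
}
```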
Sorry for the radio silence...we're slammed working towards the next release. I don't have experience with Prometheus, but I work with the guy who wrote the go-metrics library, so my plan is to find out any integration preferences from him and get this slotted in for the release after next.
!summon @juliusv :-)
I think my preference would be to have a separate listener on a separate port, but I don't know if this breaks the normal model of Prometheus. The
From a Prometheus perspective, a separate port is totally fine, if needed.
I think the nice thing there is that there is no potential contention within the Vault API, and no interaction with Vault's core by the Prometheus code.
@jefferai Good idea! I'm away for a few days, but I'll change the PR on Monday.
Took a bit longer than expected, but I took another stab at it now. I've moved the metrics endpoint to its own HTTP server on port 8201. I've a few question marks regarding configuration with this approach, however - should it attempt to mimic the settings of the (real) Vault server with regards to IP/TLS settings etc.? Right now I just use plain HTTP on
Would it be possible/advisable to piggyback on the
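(For concreteness, a rough sketch of what a dedicated plain-HTTP metrics listener on port 8201 could look like. The port and the lack of TLS come from the comment above; the `startMetricsListener` helper and everything else are assumptions, not the PR's code.)

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// startMetricsListener serves only /metrics on its own listener, so the
// Prometheus handler never touches the main Vault API mux.
func startMetricsListener(addr string) {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler())
	go func() {
		if err := http.ListenAndServe(addr, mux); err != nil {
			log.Printf("metrics listener stopped: %v", err)
		}
	}()
}

func main() {
	startMetricsListener("0.0.0.0:8201") // plain HTTP, no TLS, per the comment
	select {}                            // stand-in for the main server loop
}
```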
Moving metrics to a separate port is not a good idea. It adds unnecessary operational complexity.
@SuperQ I'm not entirely convinced either, but since @jefferai and Julius seemed to believe it was a better match, I gave it a try at least. I can see Jeff's point about isolating it from Vault's core. Even so, I would also prefer a solution which did not use another port, but I'll need some guidance on where to attach in that case. Just dropping it in the main server initialization as I did in the first version was a bit ugly.
We had our employee summit last week and had a conversation about this across the various HC projects (and other monitoring solutions that require you to build in a listener). Unfortunately, we've decided that we do not want to support this mode of operation. Adding additional handler code unnecessarily opens up further attack surface and creates an ongoing maintainability burden, since if the expected API of the upstream monitoring service changes, it requires updating within Vault. We think the overall better approach would be to simply create a tiny handler that ingresses the data from Vault (e.g. via statsite or statsd) and exposes a Prometheus endpoint for it. That decouples the API expectations from Vault itself and does not require additional network-facing code. Sorry about the long delay on this...it just took a while to get all of the right people together to discuss.
For anyone watching, I'm using this successfully: https://github.com/prometheus/statsd_exporter
@roboll That seems like a perfect answer :-D
Would /v1/sys/metrics be so bad? It could even require a token if the data was considered sensitive. The statsd_exporter isn't terrible, but it's far from ideal either (you can't really auth a UDP statsd endpoint at this time). It's also something else people have to run and manage. You do lose a few other things this way.
@tcolgate An authenticated
The Prometheus metrics format is trivial to implement yourselves if it is
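(To illustrate how simple the text exposition format is to emit by hand, here is a small sketch; the metric name, label, and value are made up purely for illustration and are not real Vault metrics.)

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// metricsHandler hand-writes the Prometheus text exposition format.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprintln(w, "# TYPE vault_example_requests_total counter")
	fmt.Fprintf(w, "vault_example_requests_total{path=%q} %d\n", "sys/health", 42)
}

func main() {
	http.HandleFunc("/metrics", metricsHandler)
	log.Fatal(http.ListenAndServe(":8080", nil)) // address is illustrative
}
```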
@tcolgate That is a good idea! I suppose this would need to go into the
I'm using statsd for Consul metrics and it was a PITA to set up, and it provides less accurate metrics (both in terms of timestamps and due to pre-summing on the statsd side of things). I understand if people are reluctant to depend on a 3rd-party library, but IMO there should be at least some way to pull the current metric state in whatever format. That is the only way to get "correct" metrics in a pull-based monitoring system.
@roboll do you happen to have the statsd_exporter config that you used to fix up the metrics? I'm currently using it without a configuration, so I get a ton of mostly hard-to-use metrics. Outside of that, I would love it if we had an HTTP endpoint to pull metrics from, similar to what Elasticsearch does. That way it would be easy to write an exporter that pulls from there.
You can also use https://github.com/seatgeek/statsd-rewrite-proxy to convert Consul telemetry into tagged metrics for Datadog or similar tagged-data backends.
@jefferai Have you re-evaluated this already? You wouldn't be the first. The Prometheus metrics format has become the de facto standard for metrics in the 'cloud native' world. It's not only intended for consumption by Prometheus: Datadog, for example, uses the same format to gather metrics from things like Kubernetes components.
Related issue: #1230
This is a first draft of how to expose metrics to Prometheus, using the support built into go-metrics.
The main pain point is likely in how the HTTP endpoint is registered - I'm not sure how best to go about it. It should only be registered if the config option is enabled, for a start, and probably not in the main handler package. Can/should the mount mechanism be used?
Also, I would prefer it if it were possible to set DisableHostname when exporting to Prometheus (the hostname will be collected as part of the scrape anyway). However, if some users want to have multiple telemetry options enabled, it shouldn't affect all of them, I think. I don't see an easy way to change it to a per-telemetry-sink setting, though. Maybe creating some per-sink option map? Might be a bit of over-doing it?
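(To illustrate why this is awkward to do per sink: in go-metrics the hostname behaviour is part of the shared Config, so flipping it applies to every sink in a fanout, not only the Prometheus one. A small sketch, assuming an in-memory sink as a stand-in for statsite/statsd; none of this is the PR's actual code.)

```go
package main

import (
	"log"
	"time"

	metrics "github.com/armon/go-metrics"
	metricsprom "github.com/armon/go-metrics/prometheus"
)

func main() {
	promSink, err := metricsprom.NewPrometheusSink()
	if err != nil {
		log.Fatal(err)
	}
	// Stand-in for a second sink such as statsite/statsd.
	inmemSink := metrics.NewInmemSink(10*time.Second, time.Minute)

	// EnableHostname lives on the shared Config, so turning it off here
	// affects every sink in the fanout, not just the Prometheus one.
	conf := metrics.DefaultConfig("vault")
	conf.EnableHostname = false

	fanout := metrics.FanoutSink{inmemSink, promSink}
	if _, err := metrics.NewGlobal(conf, fanout); err != nil {
		log.Fatal(err)
	}
}
```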