Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usage of histogram for timer metrics #19

Closed
marcusmartins opened this issue Jul 23, 2015 · 9 comments
Closed

Usage of histogram for timer metrics #19

marcusmartins opened this issue Jul 23, 2015 · 9 comments

Comments

@marcusmartins
Copy link

Currently, the statsd_bridge uses summary as the mapping for statsd timers and I am wondering what you think about allowing the bridge to decide between using summary or histogram for statsd timers?

Some backstory:
As our clients are already instrumented with statsd, we're considering using the statsd_bridge as an initial step to migrate our webservices metrics into prometheus and validate our setup.

Long term, we plan to use the python-client (and contribute to it) but in the very short term I am somewhat blocked on how to deal with pre-fork servers like gunicorn in a clean way. The statsd_bridge would provide us with an IPC mechanism to send metrics from multiple instance/processes and expose it to prometheus. With the histogram support, that would allow us to run statsd_bridge per instance and expose a per instance metrics endpoint (a very desirable feature).

If that's a direction that is acceptable, a few things will need to be designed/implemented:

  • Are "summary vs histogram" a runtime configuration or something that could be set on a per mapping basis?
  • If the choice can be made on per mapping basis, should that be extended to handle bucket sizes or use a default set of buckets?

To get more familiar with the code, I made a small test branch to validate the histogram implementation:
https://github.com/prometheus/statsd_bridge/compare/master...marcusmartins:histogram_support?expand=1

@juliusv
Copy link
Member

juliusv commented Jul 23, 2015

At an offsite right now, but 👍 in general to using/allowing histograms instead of summaries. For histograms it's quite important to be able to configure the right bucket sizes though, as otherwise they can become quite useless. So that probably needs to be a per-mapping configuration. Not sure about how to integrate that with the configuration mapping language, but open for suggestions!

@marcusmartins
Copy link
Author

Looking at the current implementation of 'name', labels could be be used to indicated bucket arrays and the type of metric to be used.

Something like:

request_duration_microseconds.*.*
name="request_duration_microseconds"
type="histogram"
buckets="5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500"
view="$1"
method="$2"
job="hub-web"

As buckets and type were not reserved words before, it'd not be a backwards compatible change. A work around would be to use __buckets and __type.

Another option would be to steal from the plain text format and do something like:

# TYPE request_duration_microseconds histogram 
# BUCKETS 5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500
request_duration_microseconds.*.*
view="$1"
method="$2"
job="hub-web"

It's a bigger change, but it's nice that it separates label definition from the configuration and it couldbe used to define HELP text.

Thoughts?

@juliusv
Copy link
Member

juliusv commented Jul 28, 2015

Yeah, I'd prefer something like the second approach too (don't add more in-band signaling for special-case stuff / metadata). Not sure if we should go for the comment-style format though, because it signals optionality in the text format, and in contrast to the plain text transfer format, at least some of these settings (like buckets) should be mandatory here. Hmmm... I wonder if it's time for a more structured format here in general, but not really time to think about it properly at the moment :-/

@marcusmartins
Copy link
Author

I will try to spend sometime over the weekend getting familiar with the mapper code and see if I can propose a more structured format that could work. My main concern backward compatibility but I think it could be simple enough to migrate that it might not be too bad.

@discordianfish
Copy link
Member

@marcusmartins Did you end working on this? I think it would be very useful.

@marcusmartins
Copy link
Author

@discordianfish I ended up working on a internal fix that solved our immediate problem. I will spend sometime on it this weekend.

@lswith
Copy link

lswith commented Feb 22, 2017

any updates on this?

@drawks
Copy link
Contributor

drawks commented Oct 6, 2017

This issue is resolved with the merge of #66 AFAICT

@discordianfish
Copy link
Member

Yes I think so, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants