Usage of histogram for timer metrics #19

marcusmartins · 2015-07-23T16:53:47Z

Currently, the statsd_bridge maps statsd timers to summaries. What do you think about allowing the bridge to choose between a summary and a histogram for statsd timers?

Some backstory:
As our clients are already instrumented with statsd, we're considering using the statsd_bridge as an initial step to migrate our webservices metrics into prometheus and validate our setup.

Long term, we plan to use the python-client (and contribute to it), but in the very short term I am somewhat blocked on how to deal with pre-fork servers like gunicorn in a clean way. The statsd_bridge would provide us with an IPC mechanism to send metrics from multiple instances/processes and expose them to prometheus. With histogram support, we could run one statsd_bridge per instance and expose a per-instance metrics endpoint (a very desirable feature).
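To illustrate the IPC idea, here is a minimal sketch of how each worker process could emit a timer sample over UDP in the statsd line format (`<name>:<value>|ms`). The function names, the metric name, and the host and port are assumptions made for illustration; use whatever address your bridge actually listens on.

```python
import socket


def format_timer(name: str, millis: float) -> bytes:
    """Encode one timer sample in the statsd line format: <name>:<value>|ms."""
    return f"{name}:{millis}|ms".encode("ascii")


def send_timer(name: str, millis: float,
               host: str = "127.0.0.1", port: int = 9125) -> None:
    """Send one timer sample as a single UDP datagram.

    host and port are placeholders (assumptions), not values taken from
    this thread.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_timer(name, millis), (host, port))
    finally:
        sock.close()
```

Because UDP is connectionless, every pre-forked worker can call `send_timer` independently, and the bridge becomes the single place where the samples are aggregated and exposed.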

If that's a direction that is acceptable, a few things will need to be designed/implemented:

Is "summary vs histogram" a runtime configuration or something that could be set on a per-mapping basis?
If the choice can be made on a per-mapping basis, should that be extended to handle bucket sizes, or should a default set of buckets be used?

To get more familiar with the code, I made a small test branch to validate the histogram implementation:
https://github.com/prometheus/statsd_bridge/compare/master...marcusmartins:histogram_support?expand=1

juliusv · 2015-07-23T19:45:51Z

At an offsite right now, but 👍 in general to using/allowing histograms instead of summaries. For histograms it's quite important to be able to configure the right bucket sizes though, as otherwise they can become quite useless. So that probably needs to be a per-mapping configuration. Not sure about how to integrate that with the configuration mapping language, but open for suggestions!
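A small sketch of why bucket choice matters, assuming cumulative "less than or equal" buckets as Prometheus histograms use. The helper name and the sample values are made up for illustration.

```python
import math


def cumulative_bucket_counts(observations, upper_bounds):
    """Count observations per cumulative bucket (value <= upper bound).

    An implicit +Inf bucket is appended, so the last count always equals
    the total number of observations.
    """
    bounds = sorted(upper_bounds) + [math.inf]
    return [(bound, sum(1 for value in observations if value <= bound))
            for bound in bounds]


# Hypothetical request durations.
samples = [3, 7, 12, 40, 90]

# Buckets that match the data spread the samples out.
well_chosen = cumulative_bucket_counts(samples, [5, 10, 25, 50, 100])

# Buckets that are far too large put every sample into the first bucket,
# so the distribution below 1000 is no longer visible.
poorly_chosen = cumulative_bucket_counts(samples, [1000, 2000])
```

With the second set of bounds every bucket holds the same count, so quantile estimates derived from it carry almost no information. That is the "quite useless" case described above.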

marcusmartins · 2015-07-24T13:09:51Z

Looking at the current implementation of 'name', labels could be used to indicate the bucket array and the type of metric to be used.

Something like:

request_duration_microseconds.*.*
name="request_duration_microseconds"
type="histogram"
buckets="5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500"
view="$1"
method="$2"
job="hub-web"

As buckets and type were not reserved words before, this would not be a backwards-compatible change. A workaround would be to use __buckets and __type.
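As a rough sketch of how this in-band variant could be read, the following parser separates the reserved keys from ordinary labels. It is not the bridge's actual mapper (which is written in Go); the function name, the returned structure, and the default of falling back to a summary are assumptions.

```python
import re

# One key="value" pair per line, as in the proposed mapping syntax.
PAIR_RE = re.compile(r'^(\w+)="(.*)"$')


def parse_mapping(block, type_key="type", buckets_key="buckets"):
    """Split a mapping block into match pattern, metric type, buckets, labels.

    type_key and buckets_key are configurable so the same sketch also
    covers the __type / __buckets workaround.
    """
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    mapping = {"match": lines[0], "type": "summary", "buckets": None, "labels": {}}
    for line in lines[1:]:
        pair = PAIR_RE.match(line)
        if pair is None:
            raise ValueError(f"bad mapping line: {line!r}")
        key, value = pair.group(1), pair.group(2)
        if key == type_key:
            mapping["type"] = value
        elif key == buckets_key:
            mapping["buckets"] = [float(b) for b in value.split(",")]
        else:
            mapping["labels"][key] = value
    return mapping


example = parse_mapping('''
request_duration_microseconds.*.*
name="request_duration_microseconds"
type="histogram"
buckets="5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500"
view="$1"
method="$2"
job="hub-web"
''')
```

The compatibility problem shows up directly: an existing configuration that already uses `type` or `buckets` as a label name would silently lose that label unless the reserved keys are renamed.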

Another option would be to steal from the plain text format and do something like:

# TYPE request_duration_microseconds histogram 
# BUCKETS 5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500
request_duration_microseconds.*.*
view="$1"
method="$2"
job="hub-web"

It's a bigger change, but it's nice that it separates label definition from the configuration, and it could be used to define HELP text.
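For comparison, a minimal sketch of reading the comment-style variant, where `# TYPE` and `# BUCKETS` lines carry the metadata and the remaining lines stay pure label definitions. Again the function name and result layout are hypothetical, not an existing parser.

```python
def parse_directive_block(block):
    """Parse '# TYPE' / '# BUCKETS' directives followed by a mapping."""
    config = {"name": None, "type": "summary", "buckets": None,
              "match": None, "labels": {}}
    for raw in block.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            # Layout assumed here: "# TYPE <metric name> <metric type>"
            _, _, name, metric_type = line.split(None, 3)
            config["name"] = name
            config["type"] = metric_type
        elif line.startswith("# BUCKETS "):
            bounds = line[len("# BUCKETS "):]
            config["buckets"] = [float(b) for b in bounds.split(",")]
        elif config["match"] is None:
            # First non-directive line is the statsd match pattern.
            config["match"] = line
        else:
            key, _, value = line.partition("=")
            config["labels"][key] = value.strip('"')
    return config


parsed = parse_directive_block('''
# TYPE request_duration_microseconds histogram
# BUCKETS 5, 10, 25, 50, 100, 250, 500, 1000, 1500, 2500
request_duration_microseconds.*.*
view="$1"
method="$2"
job="hub-web"
''')
```

Here no label name is reserved, so existing label definitions keep working unchanged.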

Thoughts?

juliusv · 2015-07-28T12:37:29Z

Yeah, I'd prefer something like the second approach too (don't add more in-band signaling for special-case stuff / metadata). Not sure if we should go for the comment-style format though, because it signals optionality in the text format, and in contrast to the plain text transfer format, at least some of these settings (like buckets) should be mandatory here. Hmmm... I wonder if it's time for a more structured format here in general, but not really time to think about it properly at the moment :-/

marcusmartins · 2015-07-31T17:30:18Z

I will try to spend some time over the weekend getting familiar with the mapper code and see if I can propose a more structured format that could work. My main concern is backward compatibility, but I think migration could be simple enough that it might not be too bad.

discordianfish · 2016-05-02T18:27:09Z

@marcusmartins Did you end up working on this? I think it would be very useful.

marcusmartins · 2016-05-04T17:40:40Z

@discordianfish I ended up working on an internal fix that solved our immediate problem. I will spend some time on it this weekend.

lswith · 2017-02-22T05:56:05Z

any updates on this?

drawks · 2017-10-06T20:17:59Z

This issue is resolved with the merge of #66 AFAICT

discordianfish · 2017-10-07T10:03:51Z

Yes I think so, thanks!

juliusv mentioned this issue Aug 20, 2015

Mapped metric name cannot be dynamic #20

Closed

grobie added the enhancement label Feb 25, 2017

bakins mentioned this issue Mar 13, 2017

Allow histograms for timer metrics #66

Merged

discordianfish closed this as completed Oct 7, 2017
