Tracking peak/max value #358
@kornelski could you expand on your use case a bit more? Which Prometheus queries are you planning to run on this gauge? If you want to make sure you don't miss spikes between Prometheus scrapes, try modelling your use case with two counters instead of one gauge. E.g. for a queue, instead of one gauge tracking the size of the queue, have two counters: one incremented on enqueue, one incremented on dequeue.
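A minimal sketch of the two-counter pattern using standard-library atomics (the names `ENQUEUED_TOTAL`, `DEQUEUED_TOTAL`, and `queue_length` are illustrative, not part of the prometheus crate's API):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Two monotonically increasing counters, as suggested above.
static ENQUEUED_TOTAL: AtomicU64 = AtomicU64::new(0);
static DEQUEUED_TOTAL: AtomicU64 = AtomicU64::new(0);

fn on_enqueue() {
    ENQUEUED_TOTAL.fetch_add(1, Ordering::Relaxed);
}

fn on_dequeue() {
    DEQUEUED_TOTAL.fetch_add(1, Ordering::Relaxed);
}

/// Queue length derived from the two counters. Prometheus can compute
/// the same difference server-side, across scrapes, from the two series.
fn queue_length() -> u64 {
    ENQUEUED_TOTAL.load(Ordering::Relaxed) - DEQUEUED_TOTAL.load(Ordering::Relaxed)
}
```

Because counters only go up, Prometheus can also use their per-scrape increases to see how much traffic flowed through between scrapes, which a sampled gauge cannot show.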
In my case it's the number of concurrent server requests being processed. I increase a gauge when a request comes in, and decrease it when the request is done. The problem is that when the gauge is scraped, it's close to 0 most of the time, because requests are processed pretty quickly. But I have some very sudden traffic spikes, and I'd like to know how many requests hit my server at exactly the same time. The two-counter solution is interesting, but I think the counters would also be nearly equal whenever they're scraped, so I'd still need extra instrumentation to catch momentary peaks between scrapes.
How about simply increasing a counter when a request comes in? Then you can know the concurrency of the requests by using
No, that gives the rate at which they come in, but it can't tell how many of them are being actively processed in parallel. In terms of queueing theory, I have a steady state where arrivals equal departures. I can easily measure the rates of arrivals and departures, but I want to know the queue length, and not the typical/average/sampled length, but the maximum queue length reached.
Thanks for the details @kornelski. I am not directly opposed to exposing some of the atomic operations to the user, but I would like to suggest another alternative to the two gauges. Say:

`queue_length_histogram.observe(num_concurrent_requests.inc() + 1);`

Depending on your bucket distribution you can get a good approximation of the max queue length by subtracting the second-highest cumulative bucket count from the highest. Compared to the two-gauge approach you (a) don't have a race condition and (b) get not only the maximum queue length between scrapes, but an approximation of the queue-length distribution across the scrape interval, e.g. via quantiles. Let me know what you think.
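The one-liner above assumes `inc()` returns the updated value; with standard-library atomics, `fetch_add` returns the previous value, so the same idea might be sketched like this (the `observe` function here is a hypothetical stand-in for a histogram, not the prometheus crate's API):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static NUM_CONCURRENT_REQUESTS: AtomicU64 = AtomicU64::new(0);

/// Hypothetical histogram observe: record a queue-length sample into
/// cumulative buckets with upper bounds 1, 2, 4, 8, +Inf.
fn observe(buckets: &[AtomicU64; 5], value: u64) {
    let bounds = [1, 2, 4, 8, u64::MAX];
    for (i, &bound) in bounds.iter().enumerate() {
        if value <= bound {
            buckets[i].fetch_add(1, Ordering::Relaxed);
        }
    }
}

fn on_request_start(buckets: &[AtomicU64; 5]) {
    // fetch_add returns the previous value, so add 1 for this request.
    let prev = NUM_CONCURRENT_REQUESTS.fetch_add(1, Ordering::Relaxed);
    observe(buckets, prev + 1);
}

fn on_request_end() {
    NUM_CONCURRENT_REQUESTS.fetch_sub(1, Ordering::Relaxed);
}
```

Each request atomically bumps the concurrency gauge and records the resulting concurrency level into the histogram, so spikes between scrapes leave a permanent trace in the bucket counts.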
I don't quite follow why you'd subtract bucket counts. I think an approximate maximum could be found by looking for the highest-valued bucket that has a non-zero hit count. So
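The reading suggested here can be sketched as follows (bucket bounds and counts are illustrative; note Prometheus buckets are cumulative, so per-bucket counts are differences of adjacent cumulative counts):

```rust
/// Given bucket upper bounds and per-bucket (non-cumulative) hit counts,
/// return the upper bound of the highest bucket with a non-zero count.
/// That bound is an upper estimate of the maximum observed value.
fn approx_max(bounds: &[f64], counts: &[u64]) -> Option<f64> {
    bounds
        .iter()
        .zip(counts)
        .rev()
        .find(|(_, &count)| count > 0)
        .map(|(&bound, _)| bound)
}
```

For example, with bounds `[1, 2, 4, 8]` and counts `[5, 3, 1, 0]`, the highest non-empty bucket is the one bounded by 4, so the maximum was somewhere in `(2, 4]`.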
I'd like to have a gauge that precisely tracks the peak value of another gauge (I have a gauge that goes up and down, and its temporary peak value is more interesting than its current value at any given time). I'm not sure what the best way to implement it is.
Currently gauges have only inc/get/set, so I have something like this:
but the get+set doesn't seem elegant, and isn't atomic. Maybe you could expose `fetch_max`?
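The racy get+set pattern being described presumably looks something like this (an illustrative sketch with plain std atomics, not the author's actual code or the prometheus crate's gauge type):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

static GAUGE: AtomicU64 = AtomicU64::new(0);
static PEAK: AtomicU64 = AtomicU64::new(0);

fn update_peak() {
    let current = GAUGE.load(Ordering::Relaxed);
    // Race: another thread can update PEAK between the load and the
    // store below, and its larger value would be silently overwritten.
    if current > PEAK.load(Ordering::Relaxed) {
        PEAK.store(current, Ordering::Relaxed);
    }
}
```

`AtomicU64::fetch_max` (stable in Rust since 1.45) collapses the load/compare/store into a single atomic operation: `PEAK.fetch_max(current, Ordering::Relaxed)`.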
If `inc()` returned the current value, I could even do:
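A sketch of what that could look like if `inc()` returned the updated value (the `Gauge` wrapper and names here are hypothetical, not the crate's current API):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical gauge whose inc() returns the updated value.
struct Gauge(AtomicU64);

impl Gauge {
    const fn new() -> Self {
        Gauge(AtomicU64::new(0))
    }
    fn inc(&self) -> u64 {
        // fetch_add returns the previous value, so add 1.
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }
    fn dec(&self) {
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}

static IN_FLIGHT: Gauge = Gauge::new();
static PEAK: AtomicU64 = AtomicU64::new(0);

fn on_request_start() {
    // One line, fully atomic: raise the peak if this request set a new high.
    PEAK.fetch_max(IN_FLIGHT.inc(), Ordering::Relaxed);
}

fn on_request_end() {
    IN_FLIGHT.dec();
}
```

The exporter would then report `PEAK` (and optionally reset it) at scrape time, so spikes between scrapes are never lost.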