Provide some guidance for coming up with default histogram buckets for various metrics #316

Open
trask opened this issue Sep 12, 2023 · 2 comments

@trask
Member

trask commented Sep 12, 2023

Coming up with default histogram buckets for various metrics can be challenging, as we've seen in #274.

I'd like to propose some guidance for coming up with default histogram buckets, which we can lean on each time the question of defining default histogram buckets comes up.

Proposal:

  • If values align with service timings, use our default buckets, translated from millis to seconds, [ 0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 ] (dropping the zero bucket?)
  • Otherwise determine roughly the smallest and largest buckets that you care about, e.g. <0.01 seconds and >10 seconds, and use an exponential range in between (consider using base 2 if you need higher granularity and base 10 if you need lower granularity).
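
As a rough illustration of the two bullets above, here is a minimal Python sketch. The helper names `millis_to_seconds` and `exponential_bounds` are hypothetical and not part of any OpenTelemetry API. The sketch assumes that "exponential range" means repeatedly multiplying the smallest boundary by the base and stopping at the last value that does not exceed the largest one, which is only one possible reading.

```python
from decimal import Decimal

# OpenTelemetry's default explicit bucket boundaries, in milliseconds.
DEFAULT_MS = [0, 5, 10, 25, 50, 75, 100, 250, 500, 750,
              1000, 2500, 5000, 7500, 10000]


def millis_to_seconds(bounds_ms, drop_zero=False):
    """Translate millisecond boundaries to seconds (hypothetical helper)."""
    # Decimal arithmetic avoids float noise such as 0.07500000000000001.
    secs = [float(Decimal(b) / Decimal(1000)) for b in bounds_ms]
    return [s for s in secs if s != 0] if drop_zero else secs


def exponential_bounds(smallest, largest, base):
    """Boundaries smallest, smallest*base, ... up to at most largest.

    Assumption: the largest value is only included when the sequence
    lands on it exactly (it does for base 10, it does not for base 2).
    """
    if base <= 1 or smallest <= 0:
        raise ValueError("need base > 1 and smallest > 0")
    limit = Decimal(str(largest))
    value = Decimal(str(smallest))
    bounds = []
    while value <= limit:
        bounds.append(float(value))
        value *= base
    return bounds


print(millis_to_seconds(DEFAULT_MS, drop_zero=True))
print(exponential_bounds(0.01, 10, base=10))  # -> [0.01, 0.1, 1.0, 10.0]
print(exponential_bounds(0.01, 10, base=2))
```

With base 2 the sequence runs 0.01, 0.02, 0.04, ... and ends at 5.12, so a metric that needs an explicit 10 second boundary would have to add it by hand or start from a different value.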

These would only be recommendations, so if there is a compelling reason to come up with a completely custom set of buckets for a particular metric, that would always be ok.

@trask
Member Author

trask commented Sep 12, 2023

another option could be to follow the pattern [ .1, .25, .5, .75, 1] in between the low and high buckets, e.g.

[ 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 ]

and another similar option with fewer buckets:

[ 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10 ]

or even fewer:

[ 0.01, 0.05, 0.1, 0.5, 1, 5, 10 ]
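
The three lists above can be reproduced by repeating a fixed set of multipliers within each decade. The sketch below is only one way to do that; `patterned_bounds` and its parameters are hypothetical names chosen for illustration.

```python
from decimal import Decimal


def patterned_bounds(low_exp, high_exp, steps=("1", "2.5", "5", "7.5")):
    """Repeat `steps` in every decade from 10**low_exp up to 10**high_exp.

    `steps` are multipliers given as strings so that Decimal keeps them
    exact; the final boundary 10**high_exp is appended once at the end.
    """
    bounds = []
    for exp in range(low_exp, high_exp):
        decade = Decimal(10) ** exp
        bounds.extend(float(Decimal(s) * decade) for s in steps)
    bounds.append(float(Decimal(10) ** high_exp))
    return bounds


# 0.01 s to 10 s with four, three, and two boundaries per decade.
print(patterned_bounds(-2, 1))
print(patterned_bounds(-2, 1, steps=("1", "2.5", "5")))
print(patterned_bounds(-2, 1, steps=("1", "5")))
# -> [0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0]
```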

@jack-berg
Member

Generalizing your last comment into an algorithm we could encode into the spec might look like:

  • Start with base 10 "outer" bucket boundaries.
  • How many and which base 10 orders of magnitude do measurements typically span? E.g. two orders of magnitude (three buckets) starting at 1 would be [1, 10], or [10, 100] if starting at ten. Three orders of magnitude (four buckets) starting at 1 would be [1, 10, 100], or [10, 100, 1_000] if starting at ten.
  • Decide how many inner buckets you need to get a decent distribution useful to most users most of the time. For each outer bucket boundary (skipping the first), the inner bucket boundaries are distributed evenly between [0, outer_bucket_bound]. Maybe we constrain it to multiples of two so the bucket boundaries don't fall on numbers with infinitely repeating decimals. For example:
    • One inner bucket boundary with two outer bucket orders of magnitude starting at 1 yields: [1, 5, 10]
    • One inner bucket boundary with three outer bucket orders of magnitude starting at 1 yields: [1, 5, 10, 50, 100]
    • Three inner bucket boundaries with two outer bucket orders of magnitude starting at 1 yields: [1, 2.5, 5, 7.5, 10]
    • Three inner bucket boundaries with three outer bucket orders of magnitude starting at 1 yields: [1, 2.5, 5, 7.5, 10, 25, 50, 75, 100]
  • The number of bucket boundaries becomes equal to (orders_of_magnitude - 1) * (inner_bucket_boundaries + 1) + 1, and the number of buckets is one more than that.
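
A minimal Python sketch of the steps above, to make the examples easy to check. The function name `bucket_boundaries` and its parameters are hypothetical. It assumes that "distributed evenly between [0, outer_bucket_bound]" means dividing each outer boundary into `inner_boundaries + 1` equal parts, and it drops any inner value that would not exceed the previous outer boundary.

```python
from fractions import Fraction


def bucket_boundaries(start, orders_of_magnitude, inner_boundaries):
    """Base 10 outer boundaries with evenly spaced inner boundaries.

    `start` should be an int or a decimal string such as "0.01" so that
    Fraction represents it exactly.
    """
    outer = [Fraction(start) * 10**i for i in range(orders_of_magnitude)]
    bounds = [outer[0]]
    for lower, upper in zip(outer, outer[1:]):
        # Split [0, upper] into inner_boundaries + 1 equal parts.
        step = upper / (inner_boundaries + 1)
        inner = [step * k for k in range(1, inner_boundaries + 1)]
        # Assumption: keep only inner values above the previous outer bound.
        bounds.extend(b for b in inner if b > lower)
        bounds.append(upper)
    return [float(b) for b in bounds]


print(bucket_boundaries(1, 2, 1))  # -> [1.0, 5.0, 10.0]
print(bucket_boundaries(1, 2, 3))  # -> [1.0, 2.5, 5.0, 7.5, 10.0]
print(bucket_boundaries(1, 3, 3))
```

Using `Fraction` keeps the intermediate arithmetic exact, so a choice such as two inner boundaries (thirds of ten) shows up as an obviously repeating decimal only at the final conversion to float.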

The advantage of laying down some convention like this is that, while there are still domain-specific decisions to debate like "how many orders of magnitude is typical?" and "how many inner buckets yield a useful distribution for most users?", it does narrow the solution space quite a bit. It also produces bucket boundaries which are nice even numbers with an intuitive explanation.
