Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram does not provide accessors #241

Open
cratelyn opened this issue Nov 18, 2024 · 0 comments · May be fixed by #242
Open

Histogram does not provide accessors #241

cratelyn opened this issue Nov 18, 2024 · 0 comments · May be fixed by #242

Comments

@cratelyn
Copy link
Contributor

💬 background

Counter and Gauge provide a get() method, which are helpful for various use cases. currently, Histogram does not provide any equivalent way for callers to observe its count, sum, or buckets.

for my particular use case, i am interested in test coverage that confirms that Prometheus metrics are properly installed and measuring behavior in my application. because there is not any way to inspect histograms however, i am left in an unfortunate position where i can only write test coverage for my application if i use counters and gauges.

so, i'd like to propose the addition of a few new methods to Histogram.

⚖️ proposal (part 1)

currently the histogram structure is defined as such:

pub struct Histogram {
    inner: Arc<RwLock<Inner>>,
}


#[derive(Debug)]
pub(crate) struct Inner {
    sum: f64,
    count: u64,
    buckets: Vec<(f64, u64)>,
}

first, i'd like to propose the following, which seem least controversial and most straightforward:

impl Histogram {
    /// Returns the current sum of all observations.
    pub fn sum(&self) -> f64 {
        self.inner.read().sum
    }

    /// Returns the current number of observations.
    pub fn count(&self) -> u64 {
        self.inner.read().count
    }
}

these two methods would address most of my particular concerns with respect to introspection in tests, and would help bring Histogram in line with the interfaces provided by Gauge and Counter metrics.

⚖️ proposal (part 2)

it would also be very helpful if there was a way to access the buckets.

most simply, we could provide a way to acquire an iterator over the bucket tuples. i'd imagine this would look approximately like:

impl Histogram {
    /// Returns a view of the histogram's buckets.
    pub fn buckets(&self) -> MappedRwLockReadGuard<[(f64, u64)]> {
        let inner = self.inner.read();
        let buckets = RwLockReadGuard::map(inner, |inner| inner.buckets.as_slice());
        buckets
    }
}

we could provide a way to get the bucket for a given observed value. the nice part about this is that a caller wouldn't need to know how a histogram has been configured, and spares them the trouble/boilerplate of iterating through all of the buckets to find the proper bucket.

impl Histogram {
    /// Returns the bucket given an observed value.
    pub fn lookup(&self, v: f64) -> Option<(f64, u64)> {
        let inner = self.inner.read();
        inner
            .buckets
            .iter()
            .find(|(upper_bound, _value)| upper_bound >= &v)
            .copied()
    }
}

note that this isn't mutually exclusive with the buckets() accessor, we could provide either/both of these.

that said, i'm less attached to these bucket interfaces. i'll also note that the MappedRwLockReadGuard would be subject to the same deadlock concerns mentioned in #231, but this signature seems like it follows the spirit of other existing interfaces in this library.


i am especially interested in adding sum() and count(), though i am also happy to help with adding the other methods if these are also welcome.

cratelyn added a commit to cratelyn/prometheus-client that referenced this issue Nov 18, 2024
fixes prometheus#241.

this commit introduces two new public methods to `Histogram`; `sum()`
and `count()` return the sum of all observations and the number of
observations made, respectively.

Signed-off-by: katelyn martin <me+cratelyn@katelyn.world>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 20, 2024
see:

* prometheus/client_rust#242
* prometheus/client_rust#241

for now, refactor this test so that it gates all use of the (proposed)
`sum()` and `count()` accessors behind a temporary feature gate.

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 20, 2024
see:

* prometheus/client_rust#242
* prometheus/client_rust#241

for now, refactor this test so that it gates all use of the (proposed)
`sum()` and `count()` accessors behind a temporary feature gate.

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 20, 2024
see:

* prometheus/client_rust#242
* prometheus/client_rust#241

for now, refactor this test so that it gates all use of the (proposed)
`sum()` and `count()` accessors behind a temporary feature gate.

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 20, 2024
* feat(app): Backend response frame count metrics

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC response bodies, and observing (a) the
number of frames yielded by a body, and (b) the number of bytes included
in body frames, and (c) the distribution of frame sizes.

this middleware allows operators to reason about how large or small the
packets being served in a backend's response bodies are.

a route-level middleware that instruments request bodies will be added
in a follow-on PR.

 ### 📝 changes

an overview of changes made here:

* the `linkerd-http-prom` has a new `body_data` submodule. it exposes
  `request` and `response` halves, to be explicit about which body is
  being instrumented on a `tower::Service`.

* the `linkerd-http-prom` crate now has a collection of new
  dependencies: `bytes` is added as a dependency in order to inspect the
  data chunk when the inner body yields a new frame. `futures-util` and
  `http-body` are added as dev-dependencies for the accompanying test
  coverage.

* body metrics are affixed to the `RouteBackendMetrics<L>` structure,
  and registered at startup.

Signed-off-by: katelyn martin <kate@buoyant.io>

* review: Inline attribute to service passthrough

Signed-off-by: katelyn martin <kate@buoyant.io>

* review: Inline attribute to body passthrough

continuing this theme of inlining, we inline the passthrough methods on
`Body` as well.

Signed-off-by: katelyn martin <kate@buoyant.io>

* review: Box `<RecordBodyData as Service>::Future` values

Signed-off-by: katelyn martin <kate@buoyant.io>

* review: rename `get()` to `metrics()`

Signed-off-by: katelyn martin <kate@buoyant.io>

* review: simplify `RecordBodyData<S>` response

Signed-off-by: katelyn martin <kate@buoyant.io>

* Update ResponseBody metrics to use a histogram

Signed-off-by: Oliver Gould <ver@buoyant.io>

* refactor(tests): feature gate frame size histogram assertions

see:

* prometheus/client_rust#242
* prometheus/client_rust#241

for now, refactor this test so that it gates all use of the (proposed)
`sum()` and `count()` accessors behind a temporary feature gate.

Signed-off-by: katelyn martin <kate@buoyant.io>

---------

Signed-off-by: katelyn martin <kate@buoyant.io>
Signed-off-by: Oliver Gould <ver@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 21, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 21, 2024
 ### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit to linkerd/linkerd2-proxy that referenced this issue Nov 21, 2024
### ⛅ overview

this introduces a new tower middleware for Prometheus metrics, used for
instrumenting HTTP and gRPC request bodies, and observing (a) the
number of frames yielded by a body, (b) the number of bytes included
in body frames, and (c) a distribution of the size of frames yielded.

this builds upon the backend-level metrics added in #3308. this
additionally uses the route label extractor, hoisted out of the retry
middleware's Prometheus telemetry in #3337.

 ### 📝 changes

* a `linkerd_http_prom::body_data::request::NewRecordBodyData::NewRecordBodyData`
  middleware is added, which complements the equivalent
  `linkerd_http_prom::body_data::response` middleware.

* this is added to policy routes' metrics layer.

see prometheus/client_rust#241 and prometheus/client_rust#242, which
track upstream proposals to add accessors to `Histogram` that will allow
us to make test assertions that metrics are working properly. for now,
these are feature gated as also done in #3308.

Signed-off-by: katelyn martin <kate@buoyant.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant