Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support for Prometheus Native Histograms #3987

Closed
wants to merge 4 commits into from

Conversation

fstab
Copy link
Contributor

@fstab fstab commented Jul 19, 2023

I am maintainer of the Prometheus Java client library, and we recently released the 1.0.0-alpha-1 version. This is a complete re-write of the Prometheus client library. The main goal is supporting the new Prometheus Native Histograms, but we are also using the opportunity to get rid of legacy code.

The Prometheus Java client library is a low-level metrics library, most users will use it through Micrometer or other abstractions. Our goal is to make the new library integrate well with Micrometer.

I would appreciate your feedback on the API before we mark the 1.0.0 release, and I'm happy to make changes to the API to make it work well with Micrometer.

This PR is an example of how the new Prometheus Java library could be used. It adds a new io.micrometer:micrometer-registry-prometheus_native registry as an alternative to the existing io.micrometer:micrometer-registry-prometheus registry. My goal is to make you aware and get feedback, not that the PR gets merged as-is.

I also created a complete end-to-end example:

The example application has a README showing how to build it, run it, and query a Prometheus native histogram.

There are a few subtle breaking changes related to the way Prometheus metric names are generated:

  • In Prometheus the _total suffix is reserved for counters. The new library removes the _total suffix for other metric types. As a result disk_total_bytes gets renamed to disk_bytes because it's a gauge.
  • In Prometheus the unit should be at the end of the metric name. The new library creates names by appending the unit. As a result, max values like http_server_requests_seconds_max get renamed to http_server_requests_max_seconds.
  • Micrometer has > 60 histogram buckets by default for classic histograms. For Prometheus this is a lot and can easily lead to cardinality explosion. The new library uses Prometheus default buckets {.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10} aligned with the default used in Prometheus client libraries for other programming languages.

As these are breaking changes, I added this as a new micrometer-registry-prometheus_native registry rather than updating the existing one.

Anyway, it would be great to hear from you, and I'm happy to adapt the API before releasing 1.0.0.

Signed-off-by: Fabian Stäber <fabian@fstab.de>
@sonatype-lift
Copy link
Contributor

sonatype-lift bot commented Jul 19, 2023

Sonatype Lift is retiring

Sonatype Lift will be retiring on Sep 12, 2023, with its analysis stopping on Aug 12, 2023. We understand that this news may come as a disappointment, and Sonatype is committed to helping you transition off it seamlessly. If you’d like to retain your data, please export your issues from the web console.
We are extremely grateful and thank you for your support over the years.

📖 Read about the impacts and timeline

@sonatype-lift
Copy link
Contributor

sonatype-lift bot commented Jul 19, 2023

🛠 Lift Auto-fix

Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing.1

# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3987.diff -o lift-autofixes.diff

# Apply the patch with git
git apply lift-autofixes.diff

# Review the changes
git diff

Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:

curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3987.diff | git apply

Once you're satisfied, commit and push your changes in your project.

Footnotes

  1. You can preview the patch by opening the patch URL in the browser.

@@ -0,0 +1 @@
compatibleVersion=SKIP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the version number?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the API/ABI compatibility check. Since there isn't a previous version of this module to compare with, this configuration skips the compatibility check so the build doesn't fail.

@shaoxt
Copy link

shaoxt commented Feb 15, 2024

Can this PR got reviewed and merged?

Copy link
Member

@shakuzen shakuzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pull request. We're taking a closer look at it and the upgrade in general. I'm not sure the exact approach for the implementation we'll settle on, but I left a couple questions in the meantime.

Histogram.Builder builder = init(id, Histogram.newBuilder(prometheusProperties));
double[] classicBuckets = distributionStatisticConfig.getServiceLevelObjectiveBoundaries();
if (classicBuckets != null && classicBuckets.length == 0) {
builder = builder.classicOnly().withClassicBuckets(classicBuckets);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why use classicOnly if it's only adding the configured SLOs? Wouldn't adding those to the native histogram be better?


@SuppressWarnings("unchecked")
private <T extends MetricWithFixedMetadata.Builder<?, ?>> T init(Meter.Id id, T builder) {
return (T) builder.withName(PrometheusNaming.sanitizeMetricName(id.getName()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional to use PrometheusNaming directly here (and other places) rather than make an implementation of Micrometer's NamingConvention that delegates to it? See PrometheusNamingConvention from the original module. Users will sometimes want to configure a custom naming convention for a MeterRegistry to use.

@@ -0,0 +1 @@
compatibleVersion=SKIP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the API/ABI compatibility check. Since there isn't a previous version of this module to compare with, this configuration skips the compatibility check so the build doesn't fail.

jonatan-ivanov added a commit to jonatan-ivanov/micrometer that referenced this pull request Mar 11, 2024
Dependencies were upgraded to the 1.x versions of the Prometheus client
artifacts in micrometer-registry-prometheus.
This is a breaking change compared to the code that was there before.
This is unavoidable due to the breaking changes
in the Prometheus client from 0.x to 1.x.
The previous code from micrometer-registry-prometheus was moved to
a new module micrometer-registry-prometheus-simpleclient that
is deprecated but available to give users a backward compatible option
in case they need to make changes to use the new Prometheus client.

The base package used in micrometer-registry-prometheus was changed to
io.micrometer.prometheusmetrics to differentiate it from the
prior package now used in micrometer-registry-prometheus-simpleclient.
This avoids split packages and allows both modules to be used
in the same application even.

Tests were adapted to the new API and differences in the scrape output.
Tests involving exemplars are commented out until
exemplar support is added.

PrometheusHistogram's cumulative buckets were converted to delta in the
registry. The new Prometheus client wants delta buckets which is weird
since it converts them back to cumulative internally.

PrometheusNamingConvention behavior was changed, it does not append
"_total" to counters anymore. The new Prometheus client validates
Counter names and check if they end with "_total".
If they don't, it appends "_total". If they do, it throws an exception:
java.lang.IllegalArgumentException: 'cnt_total': Illegal metric name...
Because of this behavior, if we want to use PrometheusNamingConvention,
we need to modify it so that it does not append "_total".

Known issues in the micrometer-registry-prometheus module after this PR:
- Exemplars and Native Histograms do not work.
- VictoriaMetrics histograms are not supported.
- The OSGi test was failing when the new Prometheus client was used, so
it was updated to use the -simpleclient module for now.

See micrometer-metricsgh-3987
See micrometer-metricsgh-4406

Co-authored-by: Tommy Ludwig <8924140+shakuzen@users.noreply.github.com>
jonatan-ivanov added a commit that referenced this pull request Mar 11, 2024
Dependencies were upgraded to the 1.x versions of the Prometheus client
artifacts in micrometer-registry-prometheus.
This is a breaking change compared to the code that was there before.
This is unavoidable due to the breaking changes
in the Prometheus client from 0.x to 1.x.
The previous code from micrometer-registry-prometheus was moved to
a new module micrometer-registry-prometheus-simpleclient that
is deprecated but available to give users a backward compatible option
in case they need to make changes to use the new Prometheus client.

The base package used in micrometer-registry-prometheus was changed to
io.micrometer.prometheusmetrics to differentiate it from the
prior package now used in micrometer-registry-prometheus-simpleclient.
This avoids split packages and allows both modules to be used
in the same application even.

Tests were adapted to the new API and differences in the scrape output.
Tests involving exemplars are commented out until
exemplar support is added.

PrometheusHistogram's cumulative buckets were converted to delta in the
registry. The new Prometheus client wants delta buckets which is weird
since it converts them back to cumulative internally.

PrometheusNamingConvention behavior was changed, it does not append
"_total" to counters anymore. The new Prometheus client validates
Counter names and check if they end with "_total".
If they don't, it appends "_total". If they do, it throws an exception:
java.lang.IllegalArgumentException: 'cnt_total': Illegal metric name...
Because of this behavior, if we want to use PrometheusNamingConvention,
we need to modify it so that it does not append "_total".

Known issues in the micrometer-registry-prometheus module after this PR:
- Exemplars and Native Histograms do not work.
- VictoriaMetrics histograms are not supported.
- The OSGi test was failing when the new Prometheus client was used, so
it was updated to use the -simpleclient module for now.

See gh-3987
See gh-4406

Co-authored-by: Tommy Ludwig <8924140+shakuzen@users.noreply.github.com>
@jonatan-ivanov jonatan-ivanov added the superseded An issue that has been superseded by another label Apr 5, 2024
@shirapur-sid
Copy link

shirapur-sid commented Oct 22, 2024

Can you provide an update on the progress of this PR? We incorporated the Native histogram module from this repository and made necessary updates to the code based on the latest Prometheus client. After integrating these changes into our application, we have observed significant improvements in both accuracy and memory optimization. Is there any possibility that this PR will be merged? @fstab ?

@jonatan-ivanov
Copy link
Member

We added support for Prometheus Java Client 1.x separately that does not have native histogram support right now. I don't think we should add another Prometheus registry (what is in this PR) but instead add native histogram support to the existing registry. See: #4923

Also, since the native histogram has less buckets by default, could you please tell us what you did to achieve improvements in accuracy?

@shirapur-sid
Copy link

shirapur-sid commented Oct 30, 2024

ok, will look into it. @fstab do you have any plans to make the mentioned changes?
Actually native histogram has more buckets, It has a good data structure mechanism to group the multiple time series's into single time series . Here is the video on how native histogram metrics works -> https://www.youtube.com/watch?v=AcmABV6NCYk&t=587s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
superseded An issue that has been superseded by another
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support prometheus client 1.x
6 participants