Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MP Metrics 5.1 support #9032

Merged
merged 9 commits into from
Jul 31, 2024
Merged

Conversation

tjquinno
Copy link
Member

Description

Resolves #8004

Notable changes

  1. Metrics config setting gc-time-type (new feature)

    The MP Metrics 5.1 TCK fixes a problem in a TCK test which used to check for the gc.time being a counter; the spec always said it is a gauge. In the 5.1 TCK the test now checks for a gauge by examining the Prometheus output to verify that metric, and the format is different for counters and gauges.

    Helidon implements gc.time in SE as a system-provided meter. To maintain backward compatibility with Helidon 4.0.x while also passing the MP Metrics 5.1 TCK, Helidon adds metrics.gc-time-type which defaults to counter (the old behavior) but can also be set to gauge to pass the TCK or for user services to comply with MP Metrics 5.1.

    The setting is marked as deprecated; Helidon 5 should remove the option entirely, always using a gauge.

  2. Default percentile precision (bug fix)

    Micrometer allows users to set the precision with which percentiles are computed. Since 4.0.0 Helidon has defaulted the precision to 3, but the code imposed this default in the MDistributionStatisticsConfig#build method which could incorrectly overwrite a user-assigned value. The PR moves the default assignment to the builder's constructor so a user setting will overwrite the default instead of the other way around.

  3. MDistributionStatisticsConfig delegation (bug fix)

    Most of our metrics implementation classes have a delegate which points to the corresponding Micrometer object. Although
    Micrometer does have a DistributionStatisticConfig.Builder type it is hidden inside the Micrometer DistributionSummary.Builder and we cannot access it. Instead, callers indirectly update the Micrometer DistributionStatisticConfig.Builder by invoking methods on the DistributionSummary.Buider which delegates to its hidden config.

    As a result, our MDistributionStatisticsConfig and its Builder classes cannot actually make use of normal Micrometer delegates. To satisfy the Helidon metrics API these types now keep a copy of the relevant values (min. and max. expected values and percentile and bucket boundary settings) that are also stored in their Micrometer counterparts.

    Our MDistributionStatisticsConfig.Builder type now delegates to the corresponding DistributionSummary.Builder because that is where the setter methods are for the Micrometer distribution summary config settings.

    Relatedly, there are some cases in which we need a MDistributionStatisticsConfig or Builder but there is no corresponding Micrometer counterpart. For those, the PR adds the MDistributionStatisticsConfig.Unconnected inner class and its builder.

  4. Prometheus output (bug fix)

    To create Prometheus output, Helidon's Micrometer-based provider leverages Micrometer's own ability to prepare Prometheus exposition format from a Micrometer Prometheus meter registry. For multi-valued meters such as timers and distribution summaries (histograms), the Micrometer Prometheus meter registry stores multiple "sub-meters" with suffixes such as _total, _count, _sum, and _bucket. To extract that data efficiently for our Prometheus output we compute all the meter and sub-meter names we might want and pass those to the Micrometer Prometheus meter registry. This calculation had incorrectly omitted the _bucket sub-meters for distribution summaries.

  5. Expose a way to set whether to publish percentile histograms

    Micrometer, and an optional part of the MP Metrics spec 5.1, allow users to indicate whether or not to publish percentile histograms. The original Helidon metrics API did not expose this, and the optional section of the spec relies on it.

    This change affected the Timer and DistributionSummary builder types and their implementations.

  6. Configuring percentiles and buckets (new MP-only feature)

    MP Metrics 5.1 allows users to configure the percentiles and/or bucket boundaries to be used for timers and distribution summaries, formatted as a semi-colon-separated list of meter-name-expression=values-list where values-list is a comma-separated list of values. The MP Metrics spec configuration section has details. The new Helidon class DistributionCustomizations encapsulates the required config processing and exposes a simple internal API to apply defaults as needed to builders for timers and distribution summaries.

    The previously-existing places in the MP-only Helidon Registry class where timer and distr. summary builders are created (also in HelidonHistogram and HelidonTimers) now also apply any matching configured defaults to those builders when the builder is created.

    This PR adds support for this only for MP, not also SE. The MP Metrics spec is unlikely to evolve further. MP is emphasizing OTel more and more and the MP Telemetry spec relies on OTel config settings rather than adding lots of its own config settings. It did not seem prudent to add this MP Metrics-specific feature to SE at this time.

    The spec contains an optional portion which this PR also implements (again, MP only).

  7. MetricsCdiExtension config property handling (bug fix)

    Certain config settings in the mp.metrics namespace need to be reflected in the SE config space so SE metrics behaves correctly. In moving from Helidon 3.x to 4.0, the metrics appName key changed to app-name The code in the MP metrics extension which handles the mapping of MP to SE config was not changed accordingly. At this point we need to keep the existing 4.x SE config key so this code in the extension now allows for differing key names in the two config spaces.

  8. RegistryProducer (spec change)

    MP Metrics 5.0 deprecated the RegistryType concept (fixed values of base, vendor, and application) and the RegistryType annotation in favor of RegistryScope (base, vendor, and application are predefined scopes but users can add their own). In 5.0, though, the RegistryScope annotation was not marked as a qualifier. That has now changed so our provider code now responds accordingly, while continuing to support the deprecated RegistryType annotation.

  9. TCK driver (compatibility)

    To pass the TCK, our TCK driver needs to specify the new metrics.gc-time-type property as gauge. The pom.xml now sets that as a Java system property in the surefire plug-in set-up so it is visible to config.

  10. dependencies/pom.xml

    Now declares our dependency on MP metrics 5.1.1.

Documentation

The PR includes doc changes:

  1. Metrics config doc (SE and MP) for choosing how Helidon implements gc.time.
  2. Updates to the generated MetricsConfig .adoc file.

@tjquinno tjquinno self-assigned this Jul 23, 2024
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Jul 23, 2024
@tjquinno tjquinno marked this pull request as draft July 24, 2024 15:18
@tjquinno
Copy link
Member Author

I marked this as a draft PR until a bug in config metadata generation re: enums is fixed.

@tjquinno tjquinno marked this pull request as ready for review July 24, 2024 15:53
@tjquinno
Copy link
Member Author

Marked as ready for review again. The fix to #9036 should also include fix-ups to the generated metadata and the generated .adoc file for MetricsConfig.

@@ -131,6 +131,10 @@
<systemPropertyVariables>
<context.root/> <!-- Suppress the TCK's default of /optionalTCK -->
<metrics.rest-request.enabled>true</metrics.rest-request.enabled> <!-- off by default -->
<!-- TODO - remove setting of gc-time-type for Helidon 5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use a different way to mark things to be removed in 5?
For example @Deprecated(forRemoval = true) , so we can just search for this string.
TODO means that we missed something...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

I guess to me, TODO means exactly that--we know something needs to be done in the future. Yes, sometimes people use it to flag omitted work that should already have been done, but that is not true of the TODOs here and I believe the text in each clearly indicates why they are there.

But I'll go ahead and change the TODOs for removal in 5.x to commented annotations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually have a todo check in checkstyle, but it does not catch all the possible combination people use...

@tomas-langer tomas-langer added this to the 4.1.0 milestone Jul 30, 2024
@tjquinno tjquinno merged commit b17941d into helidon-io:main Jul 31, 2024
12 checks passed
@tjquinno tjquinno deleted the 4.x-adopt-mp-metrics-5.1-a branch July 31, 2024 10:40
@tjquinno tjquinno requested a review from barchetta August 3, 2024 20:58
@tjquinno tjquinno linked an issue Aug 7, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MicroProfile Metrics 5.1 4.x Track MP metrics TCK bug fix
2 participants