Use metric-schema for instrumentation metrics #1962

pkoenig10 · 2024-06-13T16:08:56Z

Before this PR

The tagged metrics produced by instrumented classes are cumbersome to use because each instrumented class uses a unique metric name. These metric names are the fully qualified class names, which can be quite long, making them difficult to use in metric query UIs.

It is also particularly cumbersome given our internal metrics filtering, since metrics for newly instrumented classes won't be available until they are used in a dashboard.

After this PR

Instrumentation metrics are defined in metric-schema. There is a single instrumentation metric and the interface name, method name, and result are tags.
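
To make the shape of the change concrete, here is a minimal sketch in plain Java, not the Tritium or metric-schema API. The metric name `instrumentation.invocation`, the tag keys, the tags on the old metric, and the `MetricId` class are all hypothetical and exist only for illustration. The real names come from the metric-schema definition in this PR.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: models a metric identifier as a name plus tags. */
final class InstrumentationMetricSketch {

    // Hypothetical metric name; the real one is defined in metric-schema.
    static final String METRIC_NAME = "instrumentation.invocation";

    /** Hypothetical stand-in for a tagged metric identifier. */
    static final class MetricId {
        final String name;
        final Map<String, String> tags;

        MetricId(String name, Map<String, String> tags) {
            this.name = name;
            this.tags = Collections.unmodifiableMap(new TreeMap<>(tags));
        }

        @Override
        public String toString() {
            return name + tags;
        }
    }

    /** Before: the metric name is the fully qualified class name, so every class gets its own metric. */
    static MetricId before(Class<?> iface, String method, String result) {
        Map<String, String> tags = new TreeMap<>();
        tags.put("method", method); // assumed tag key
        tags.put("result", result); // assumed tag key
        return new MetricId(iface.getName(), tags);
    }

    /** After: one shared metric name; the interface, method, and result are all tags. */
    static MetricId after(Class<?> iface, String method, String result) {
        Map<String, String> tags = new TreeMap<>();
        tags.put("interface", iface.getSimpleName()); // assumed tag key
        tags.put("method", method);
        tags.put("result", result);
        return new MetricId(METRIC_NAME, tags);
    }

    public static void main(String[] args) {
        System.out.println(before(java.util.List.class, "size", "success"));
        System.out.println(after(java.util.List.class, "size", "success"));
    }
}
```

With the shared name, a query can select one metric and filter or group by tag instead of having to know each class name up front.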

Possible downsides?

Dashboards will need to be updated. I don't think users tend to have these instrumentation metrics on dashboards - they are more commonly used for one-off investigations.

changelog-app · 2024-06-13T16:09:00Z

Generate changelog in `changelog/@unreleased`

What do the change types mean?

feature: A new feature of the service.
improvement: An incremental improvement in the functionality or operation of the service.
fix: Remedies the incorrect behaviour of a component of the service in a backwards-compatible way.
break: Has the potential to break consumers of this service's API, inclusive of both Palantir services and external consumers of the service's API (e.g. customer-written software or integrations).
deprecation: Advertises the intention to remove service functionality without any change to the operation of the service itself.
manualTask: Requires the possibility of manual intervention (running a script, eyeballing configuration, performing database surgery, ...) at the time of upgrade for it to succeed.
migration: A fully automatic upgrade migration task with no engineer input required.

Note: only one type should be chosen.

How are new versions calculated?

❗ The break and manual task changelog types will result in a major release!
🐛 The fix changelog type will result in a minor release in most cases, and a patch release version for patch branches. This behaviour is configurable in autorelease.
✨ All others will result in a minor version release.

Type

Description

Instrumentation metrics are now defined using metric-schema. There is a single metric name for all instrumented classes with tags for the service name, endpoint, and result.

pkoenig10 · 2024-06-13T16:10:00Z

tritium-lib/src/main/java/com/palantir/tritium/proxy/Instrumentation.java

-            String serviceName = Strings.isNullOrEmpty(prefix) ? interfaceClass.getName() : prefix;
-            this.handlers.add(new TaggedMetricsServiceInvocationEventHandler(metricRegistry, serviceName));
+            this.handlers.add(new TaggedMetricsServiceInvocationEventHandler(
+                    metricRegistry, Strings.isNullOrEmpty(serviceName) ? interfaceClass.getSimpleName() : serviceName));


I've intentionally changed the default service name from interfaceClass.getName() to interfaceClass.getSimpleName() to more closely match the behavior of similar instrumentation elsewhere, such as the Conjure endpoint instrumentation.
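
For anyone unsure how the two defaults differ, here is a small standalone sketch. It uses JDK types as stand-ins for an instrumented interface and a plain null-or-empty check in place of Guava's `Strings.isNullOrEmpty`. The `resolve` helper is hypothetical and only mirrors the fallback expression in the diff above.

```java
final class ServiceNameDefaultSketch {

    /** Falls back to the interface's simple name when no service name is supplied. */
    static String resolve(String serviceName, Class<?> interfaceClass) {
        boolean missing = serviceName == null || serviceName.isEmpty();
        return missing ? interfaceClass.getSimpleName() : serviceName;
    }

    public static void main(String[] args) {
        // Old default: fully qualified name, including the package.
        System.out.println(java.util.Map.class.getName());       // java.util.Map
        // New default: class name only, without the package.
        System.out.println(java.util.Map.class.getSimpleName()); // Map

        System.out.println(resolve(null, Runnable.class));         // Runnable
        System.out.println(resolve("my-service", Runnable.class)); // my-service
    }
}
```

One consequence of the shorter default is that two interfaces with the same simple name in different packages would share a service name unless one is set explicitly.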

pkoenig10 · 2024-06-13T16:10:50Z

tritium-lib/src/test/java/com/palantir/tritium/TritiumTest.java

-
-@ExtendWith(SystemStubsExtension.class)
-@SuppressWarnings("NullAway")
-public class TritiumTest {


These tests seem duplicative, so I've removed them.

schlosna · 2024-06-14T17:19:49Z

Directionally I'm supportive of this. @pkoenig10 do you have a sense of the migration lift for shifting from existing to new metric names & tags for monitors & queries?

pkoenig10 · 2024-06-14T18:40:51Z

It will definitely be manual work for dashboards.

I'm fairly confident that no one is using these metrics in monitors. Doing so would be fairly brittle, as the metric names could change after a refactor.

Searching internally, I don't see any monitor definitions that appear to use one of these metrics. It's a bit difficult to tell, since there are a number of legacy metrics that use class names but aren't these instrumented metrics. I used the following Sourcegraph query (case-sensitive and regex):

metric.*com\.palantir\..*\.[A-Z].+\.\w.+ and endpoint and file:.*\.hype
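
As a rough illustration of what the first clause of that query matches, here is the same regex applied with `java.util.regex`. This only models the regex clause, not the `and endpoint` or `file:` parts of the Sourcegraph query, and both sample lines are made up for the example.

```java
import java.util.regex.Pattern;

final class MetricQuerySketch {

    // First clause of the query above: a com.palantir package path, then a
    // capitalized segment (a class name), then at least one more segment.
    static final Pattern CLASS_NAMED_METRIC =
            Pattern.compile("metric.*com\\.palantir\\..*\\.[A-Z].+\\.\\w.+");

    static boolean looksLikeClassNamedMetric(String line) {
        return CLASS_NAMED_METRIC.matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical monitor lines, not taken from any real definition.
        System.out.println(looksLikeClassNamedMetric(
                "metric: com.palantir.example.FooService.doThing")); // true
        System.out.println(looksLikeClassNamedMetric(
                "metric: com.palantir.example.cache.hit.count"));    // false
    }
}
```

The second line does not match because no segment after the package prefix starts with an uppercase letter.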

svc-autorelease · 2024-06-15T12:45:40Z

Released 0.89.0

Use metric-schema for instrumentation metrics

222b810

pkoenig10 requested review from schlosna and carterkozak June 13, 2024 16:08

probot-autolabeler bot added the autorelease label Jun 13, 2024

Add generated changelog entries

9769758

pkoenig10 added the merge when ready label Jun 13, 2024

schlosna approved these changes Jun 15, 2024

bulldozer-bot bot merged commit 371e61b into develop Jun 15, 2024
5 checks passed

bulldozer-bot bot deleted the pkoenig/instrumentationMetrics branch June 15, 2024 12:45
