TaggedMetricRegistry composability #81

j-baker · 2018-06-18T17:33:12Z

Frequently libraries produce metrics, but the context of usage is not known (e.g. Atlas metrics are coupled to an Atlas client, Spark metrics are tied to a job). In this case, if a caller has multiple metric registries, they can add the child metrics to the parent with a tag, so that they can be distinguished.

This is a very quick thing to throw up, but basically internally we have a bunch of producers of the same set of metrics, and we'd like to be able to differentiate them (with tags). With this PR, we'd be able to have multiple metric registries, and merge them into a single metric registry. Please don't merge, since there are a few things I need to think through, but wanted to make sure that this doesn't sound super unreasonable.

j-baker · 2018-06-18T17:36:01Z

...-registry/src/main/java/com/palantir/tritium/metrics/registry/DropwizardTaggedMetricSet.java

+
+    @Override
+    public Map<MetricName, Metric> getMetrics() {
+        return metricRegistry.getMetrics().entrySet().stream()


I could avoid the copy here if we think it's important by implementing ForwardingMap.
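A minimal sketch of what a copy-free view could look like. It uses the JDK's `AbstractMap` rather than Guava's `ForwardingMap`, and `KeyMappedViewSketch` and `keyMappedView` are hypothetical names for illustration only, not part of this PR.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

/** Hypothetical sketch: a live, read-only view that renames keys lazily instead of copying. */
final class KeyMappedViewSketch {

    /** Returns a read-only view of source whose keys are passed through toKey on each read. */
    static <K, K2, V> Map<K2, V> keyMappedView(Map<K, V> source, Function<K, K2> toKey) {
        return new AbstractMap<K2, V>() {
            @Override
            public Set<Map.Entry<K2, V>> entrySet() {
                return new AbstractSet<Map.Entry<K2, V>>() {
                    @Override
                    public Iterator<Map.Entry<K2, V>> iterator() {
                        Iterator<Map.Entry<K, V>> delegate = source.entrySet().iterator();
                        return new Iterator<Map.Entry<K2, V>>() {
                            @Override
                            public boolean hasNext() {
                                return delegate.hasNext();
                            }

                            @Override
                            public Map.Entry<K2, V> next() {
                                Map.Entry<K, V> entry = delegate.next();
                                // The key is converted on demand; nothing is stored in the view.
                                return new AbstractMap.SimpleImmutableEntry<>(
                                        toKey.apply(entry.getKey()), entry.getValue());
                            }
                        };
                    }

                    @Override
                    public int size() {
                        return source.size();
                    }
                };
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new TreeMap<>();
        source.put("qux", 1);
        Map<String, Integer> view = keyMappedView(source, name -> "metric." + name);
        source.put("quux", 2); // visible through the view without rebuilding it
        System.out.println(view);
    }
}
```

The trade-off is that lookups on such a view fall back to `AbstractMap`'s linear scan, so it only pays off if callers mostly iterate.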

j-baker · 2018-06-18T17:37:51Z

...egistry/src/main/java/com/palantir/tritium/metrics/registry/DefaultTaggedMetricRegistry.java

@@ -36,6 +36,7 @@
    private static final TaggedMetricRegistry DEFAULT = new DefaultTaggedMetricRegistry();

    private final Map<MetricName, Metric> registry = new ConcurrentHashMap<>();
+    private final Map<Map<String, String>, TaggedMetricSet> taggedRegistries = new ConcurrentHashMap<>();


Probably not quite right, since we will probably want to put two MetricSets in for each AtlasDB; we also probably shouldn't be passing in an empty map here. Maybe a Set of Pair-type structures is the right thing here.


schlosna

Hey James, interested to see how you're planning on using this, maybe sketch out some tests showing intent? I'm wondering if we want to have a single canonical metric registry that other partitions can feed into and auto adjust MetricName to include that partition's tags.

For example if we have service Foo with a single registry, and Atlas clients Bar and Baz, we'd inject a registry that auto adds tag client=Bar and client=Baz respectively, Atlas adds a counter qux and the underlying Foo registry has two counters qux;client=Bar and qux;client=Baz.
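The Foo/Bar/Baz example above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Tritium API: `RootRegistry`, `ScopedRegistry`, and `scoped` are hypothetical names, counters are modelled with `LongAdder`, and metric names are rendered as plain `name;tag=value` strings.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: scoped views that stamp a fixed tag onto every metric they register. */
final class ScopedRegistrySketch {

    /** Stand-in for the single canonical registry; keys are rendered as "name;tag=value". */
    static final class RootRegistry {
        private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

        LongAdder counter(String name, Map<String, String> tags) {
            StringBuilder key = new StringBuilder(name);
            // Sort the tags so that equal tag sets always render to the same key.
            new TreeMap<>(tags).forEach((k, v) -> key.append(';').append(k).append('=').append(v));
            return counters.computeIfAbsent(key.toString(), unused -> new LongAdder());
        }

        Map<String, LongAdder> counters() {
            return counters;
        }

        /** Returns a view that adds the given tag to everything registered through it. */
        ScopedRegistry scoped(String tagName, String tagValue) {
            return new ScopedRegistry(this, tagName, tagValue);
        }
    }

    /** What a library such as an Atlas client would be handed instead of the root registry. */
    static final class ScopedRegistry {
        private final RootRegistry root;
        private final String tagName;
        private final String tagValue;

        ScopedRegistry(RootRegistry root, String tagName, String tagValue) {
            this.root = root;
            this.tagName = tagName;
            this.tagValue = tagValue;
        }

        LongAdder counter(String name) {
            return root.counter(name, Collections.singletonMap(tagName, tagValue));
        }
    }

    public static void main(String[] args) {
        RootRegistry foo = new RootRegistry();
        foo.scoped("client", "Bar").counter("qux").increment();
        foo.scoped("client", "Baz").counter("qux").add(2);
        System.out.println(new TreeMap<>(foo.counters()).keySet());
        // prints [qux;client=Bar, qux;client=Baz]
    }
}
```

In this shape the library never sees the tag; it only asks for `qux`, and the caller decides how the two clients are told apart.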

schlosna · 2018-06-18T18:06:32Z

...egistry/src/main/java/com/palantir/tritium/metrics/registry/DefaultTaggedMetricRegistry.java

@@ -36,6 +36,7 @@
    private static final TaggedMetricRegistry DEFAULT = new DefaultTaggedMetricRegistry();

    private final Map<MetricName, Metric> registry = new ConcurrentHashMap<>();
+    private final Map<Map<String, String>, TaggedMetricSet> taggedRegistries = new ConcurrentHashMap<>();


Yeah, this feels odd and I worry about someone putting a mutable Map<String, String> in as a key and causing all kinds of chaos.
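To make the concern concrete, here is a small self-contained sketch of what goes wrong with a mutable map key, plus one possible mitigation (a defensive, unmodifiable copy). `MutableKeySketch` and its methods are hypothetical names, and the values are plain strings standing in for metric sets.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the hazard of using a mutable Map as a map key. */
final class MutableKeySketch {

    /** Returns whether the entry can still be found after its key was mutated in place. */
    static boolean foundAfterMutation() {
        Map<Map<String, String>, String> registries = new ConcurrentHashMap<>();
        Map<String, String> tags = new HashMap<>();
        tags.put("client", "Bar");
        registries.put(tags, "metrics for Bar");

        // Mutating the key changes its hashCode, so the stored entry no longer matches lookups.
        tags.put("env", "prod");
        return registries.containsKey(tags);
    }

    /** Same scenario, but the registry stores an unmodifiable copy of the caller's map. */
    static boolean foundWithDefensiveCopy() {
        Map<Map<String, String>, String> registries = new ConcurrentHashMap<>();
        Map<String, String> tags = new HashMap<>();
        tags.put("client", "Bar");
        // Copy on the way in so later caller-side mutation cannot affect the stored key.
        registries.put(Collections.unmodifiableMap(new HashMap<>(tags)), "metrics for Bar");

        tags.put("env", "prod");
        return registries.containsKey(Collections.singletonMap("client", "Bar"));
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
        System.out.println("found with defensive copy: " + foundWithDefensiveCopy());
    }
}
```

An immutable pair of tag name and value as the key would avoid the problem without needing the copy.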

schlosna · 2018-06-18T18:09:20Z

...-registry/src/main/java/com/palantir/tritium/metrics/registry/DropwizardTaggedMetricSet.java

+    @Override
+    public Map<MetricName, Metric> getMetrics() {
+        return metricRegistry.getMetrics().entrySet().stream()
+                .collect(toMap(entry -> MetricName.builder().safeName(entry.getKey()).build(), Map.Entry::getValue));


are we ok to assume all metric names are safe here?

yes - they'll get logged internally in the same way as they currently do. The reason this was called safe name was to make clear that they have to be safe - but that's a baseline requirement.

schlosna · 2018-06-18T18:10:14Z

tritium-registry/src/main/java/com/palantir/tritium/metrics/registry/TaggedMetricRegistry.java

-     *
-     * @return map of registered metrics
-     */
-    Map<MetricName, Metric> getMetrics();


I haven't looked at how painful the API break would be, do you have a sense?

that's not actually an API break; it's just moved into the super interface

j-baker · 2018-06-25T17:29:00Z

hey dave - the motivation here is basically: we have multiple atlas clients and want to be able to control how they're reported.

I don't think libraries should be using static metrics, for a few reasons, and my conviction on this has grown over time. It's painful for producers because they have to understand the metrics with regard to a lifecycle. For example, AtlasDB has to go through the workflow of removing metrics when initialisation tasks fail, so as not to leave now-unused gauges around, and the complexity of this is nontrivial.

For consumers, it's painful because now the metrics that are being recorded are defined by the producer. For example, AtlasDB made the bad assumption that no one has more than one AtlasDB hanging around, and this was very very painful to change (2000 lines +-), because the metrics needed to be plumbed through (even though it was only like 100 lines of additions overall).

It's also painful for aggregators, since it makes the incremental cost of adding metrics almost zero, which means that tracking which metrics exist and are useful is essentially impossible. For example, AtlasDB by default adds around 1500 metrics. This makes it very difficult to have a coherent metric reporting story (which metrics are even valuable? which should I use?) because they're so numerous.

schlosna

Hey James, makes sense and agreed AtlasDB should have been injecting the appropriately scoped metrics registry from the beginning.

Had one minor API question before merging.

schlosna · 2018-06-25T17:40:10Z

tritium-registry/src/main/java/com/palantir/tritium/metrics/registry/TaggedMetricRegistry.java

+    /**
+     * Removes a TaggedMetricsSet added via addMetrics from this metrics set.
+     */
+    void removeMetrics(String safeTagName, String safeTagValue);


API question: should we return either a boolean letting consumer know if it was removed, or the removed TaggedMetricSet?

I now return an Optional<TaggedMetricSet> to be consistent with remove().
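A rough sketch of the shape being discussed, assuming a registry that stores sets under an immutable (tag name, tag value) pair. `RemoveMetricsSketch` and the nested `MetricSet` interface are hypothetical stand-ins for illustration, not the real Tritium types.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of add/remove by tag, with remove reporting what it removed. */
final class RemoveMetricsSketch {

    /** Stand-in for a set of metrics; the real TaggedMetricSet is richer than this. */
    interface MetricSet {
        Map<String, Object> getMetrics();
    }

    // The key is an immutable pair, so callers cannot mutate it after registration.
    private final Map<Map.Entry<String, String>, MetricSet> taggedSets = new ConcurrentHashMap<>();

    void addMetrics(String safeTagName, String safeTagValue, MetricSet metrics) {
        taggedSets.put(new SimpleImmutableEntry<>(safeTagName, safeTagValue), metrics);
    }

    /** Returns the removed set, or empty if nothing was registered under that tag. */
    Optional<MetricSet> removeMetrics(String safeTagName, String safeTagValue) {
        return Optional.ofNullable(
                taggedSets.remove(new SimpleImmutableEntry<>(safeTagName, safeTagValue)));
    }

    public static void main(String[] args) {
        RemoveMetricsSketch registry = new RemoveMetricsSketch();
        registry.addMetrics("client", "Bar", Collections::emptyMap);
        System.out.println(registry.removeMetrics("client", "Bar").isPresent());
        System.out.println(registry.removeMetrics("client", "Bar").isPresent());
    }
}
```

Returning the removed set covers both options from the question: a caller that only wants the boolean can call `isPresent()`.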

schlosna · 2018-06-25T18:39:08Z

...egistry/src/main/java/com/palantir/tritium/metrics/registry/DefaultTaggedMetricRegistry.java

+                        entry.getKey().getKey(),
+                        entry.getKey().getValue(),
+                        entry.getValue().getMetrics().entrySet().stream())))
+                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> a));


you got checkstyled

ParameterNameCheck - com.palantir.tritium.metrics.registry.DefaultTaggedMetricRegistry - checkstyleMain DefaultTaggedMetricRegistry.java:105: Parameter name 'b' must match pattern '^[a-z][a-zA-Z0-9]+$'.
ParameterNameCheck - com.palantir.tritium.metrics.registry.DefaultTaggedMetricRegistry - checkstyleMain DefaultTaggedMetricRegistry.java:105: Parameter name 'a' must match pattern '^[a-z][a-zA-Z0-9]+$'.
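One way to satisfy the check is to give the merge-function parameters names of two or more characters. The sketch below is standalone and uses hypothetical names (`MergeFunctionSketch`, `keepFirst`); it is not the code from the PR, only the same `toMap` pattern with a keep-first merge function.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of toMap with a keep-first merge function and checkstyle-friendly names. */
final class MergeFunctionSketch {

    /** Collects entries into a map, keeping the first value seen for a duplicate key. */
    static Map<String, Integer> keepFirst(Stream<Map.Entry<String, Integer>> entries) {
        // Single-letter names such as 'a' and 'b' fail the pattern '^[a-z][a-zA-Z0-9]+$'.
        return entries.collect(Collectors.toMap(
                Map.Entry::getKey, Map.Entry::getValue, (first, second) -> first));
    }

    public static void main(String[] args) {
        Map<String, Integer> merged = keepFirst(Stream.<Map.Entry<String, Integer>>of(
                new SimpleImmutableEntry<>("qux", 1),
                new SimpleImmutableEntry<>("qux", 2)));
        System.out.println(merged); // prints {qux=1}
    }
}
```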

schlosna · 2018-06-25T18:39:37Z

Approved pending checkstyle fix

schlosna · 2018-06-25T19:25:49Z

@j-baker go ahead and update the description when you're happy to squash & merge

jiahuijiang · 2018-07-06T20:16:22Z

@j-baker Why do we only allow one tag for each metric set here, rather than a map of tags and their values?

schlosna requested changes Jun 19, 2018

4b29c7e add test
6a79d11 Checkstyle

schlosna reviewed Jun 25, 2018

j-baker added 2 commits June 25, 2018 19:28

2cc1b75 PR comments
b36b363 PR comments

schlosna reviewed Jun 25, 2018

schlosna approved these changes Jun 25, 2018

j-baker added 4 commits June 25, 2018 19:42

3eb3645 fixups
5aeaf63 dead code
859eb7b unused import
cf44358 guava

schlosna merged commit 586e66c into palantir:develop Jun 26, 2018
