Metrics refresh #574

ivantopo · 2019-03-14T20:43:03Z

⚠️ Warning: This PR builds on top of #572 and should not be merged until that PR is merged.

Background and Motivations

Given that there are a number of breaking changes around context propagation in Kamon and that we will have to release a non-backwards-compatible Kamon 2.0 version, it seemed like a great opportunity to spread some love through the entire code base, so the work on this PR started. A number of issues sparked interest in reworking the metrics API, most notably #546, although there are plenty more in the related issues section below.

Goals

Have a much friendlier metrics API with clearly defined interfaces and differentiation between instruments.
Upgrade the metrics API to use the tags abstraction introduced in #572 (Introduce a common abstractions to handle tags).
Make sure that all APIs can be comfortably used from any JVM language.

Changes in this PR

Metrics and Instruments

Even though we did have different concepts for metrics and instruments and there isn't much of a big change there, now it is clearer that a metric is a group of instruments and all instruments for a given metric share the same settings. Furthermore, the settings are available through the .settings member on the metric and all instruments have a reference to the metric they belong to.
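To make the relationship concrete, here is a minimal sketch of the idea, not Kamon's actual classes. The names `Settings`, `Metric`, `Instrument` and `withTags` in this block are hypothetical and only model "a metric is a group of instruments that share settings, and each instrument points back to its metric".

```scala
import scala.collection.mutable

// Hypothetical toy model for illustration only; these are NOT Kamon's real types.
object MetricModelSketch {

  // Settings shared by every instrument that belongs to the same metric.
  final case class Settings(description: String, unit: String)

  // An instrument keeps a reference to the metric it belongs to.
  final class Instrument(val metric: Metric, val tags: Map[String, String]) {
    private var count = 0L
    def increment(times: Long = 1L): Instrument = { count += times; this }
    def value: Long = count
  }

  // A metric groups instruments, one per distinct tag set.
  final class Metric(val name: String, val settings: Settings) {
    private val instruments = mutable.Map.empty[Map[String, String], Instrument]

    // Returns the existing instrument for these tags or creates it on first use.
    def withTags(tags: Map[String, String]): Instrument =
      instruments.getOrElseUpdate(tags, new Instrument(this, tags))

    def instrumentCount: Int = instruments.size
  }
}
```

In this model, asking twice for the same tag set yields the same instrument, and the settings are always reached through the owning metric rather than being copied into each instrument.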

Metrics with Descriptions

We believe that a proper description of what a metric tracks can be of great help and, furthermore, many of today's systems support showing a description for metrics, so we now allow providing a description when creating a metric. We will always expose a description, even if you don't provide one when creating the metric.

Tagging instead of "refining"

We changed .refine(...) to .withTag(...) variants. At the time we thought that the word "refine" was a much better way to explain what we were doing, but if we are using tags everywhere and this function actually returns a new instrument with more tags, just calling it .withTag(...) seems way easier to explain and for users to understand.

Also, it is now possible to apply tags directly on an instrument, which adds to (or replaces) the tags on the instrument and returns the resulting instrument, rather than having to go all the way back to the base metric to get a particular instrument. A relatively common pattern in instrumentation was having a set of common tags for a metric that should be applied to all instruments and then "refining" for a particular case. That forced us to keep references to the maps used for tags, because the only way to remove an instrument was going back to the base metric and calling .remove(...) with all the tags, as shown below:

// Creating and refining a metric with Kamon 1.1
val CounterMetric = Kamon.counter("counter")
val commonTags = Map("commonTag" -> "value")
val instrumentOne = CounterMetric.refine(commonTags ++ Map("incarnation" -> "one"))
val instrumentTwo = CounterMetric.refine(commonTags ++ Map("incarnation" -> "two"))

// Removing Instruments
CounterMetric.remove(commonTags ++ Map("incarnation" -> "one"))
CounterMetric.remove(commonTags ++ Map("incarnation" -> "two"))

Since this PR, instruments know how to remove themselves from the registry and there is no need to reference the base metric at all:

// Creating and refining a metric with Kamon 2.0
val common = Kamon.counter("counter").withTag("commonTag", "value")
val instrumentOne = common.withTag("incarnation", "one")
val instrumentTwo = common.withTag("incarnation", "two")

// Removing instruments
instrumentOne.remove()
instrumentTwo.remove()
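One way to picture how an instrument can remove itself is a back-reference to whatever owns it. The block below is a sketch under that assumption, not Kamon's implementation: `Registry`, `TaggedCounter`, `lookup` and `drop` are hypothetical names introduced for illustration.

```scala
import scala.collection.mutable

// Hypothetical sketch; Kamon's real registry and instruments are more involved.
object SelfRemovalSketch {

  // Owns all instruments of a single metric, keyed by their full tag set.
  final class Registry {
    private val entries = mutable.Map.empty[Map[String, String], TaggedCounter]

    def lookup(tags: Map[String, String]): TaggedCounter =
      entries.getOrElseUpdate(tags, new TaggedCounter(this, tags))

    // Returns true only if an instrument with exactly these tags was registered.
    def drop(tags: Map[String, String]): Boolean = entries.remove(tags).isDefined

    def size: Int = entries.size
  }

  // The instrument remembers its owner and its own tags, so callers
  // never need to rebuild the tag map to remove it.
  final class TaggedCounter(registry: Registry, val tags: Map[String, String]) {

    // Adds (or replaces) one tag and returns the instrument for the combined tag set.
    def withTag(key: String, value: String): TaggedCounter =
      registry.lookup(tags + (key -> value))

    def remove(): Boolean = registry.drop(tags)
  }
}
```

With this shape the caller only holds instrument references; the tag maps that previously had to be kept around for `.remove(...)` live inside the instruments themselves.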

Finally, one thing that we had in the past was the ability to record values directly on a metric instance without refining, which would result in writing data on an instrument that has no tags. This allowed people to do things like this:

Kamon.counter("myCounter").increment()
Kamon.histogram("myHistogram").record(42L)

We had this because it seemed like many users would want to do that themselves, and there was a considerable amount of boilerplate in the metrics implementation to allow it. Now that we are introducing a much clearer separation between metrics and instruments, it no longer makes sense to allow instrument operations on a metric, so the "base" instruments (without tags) have to be requested explicitly:

Kamon.counter("myCounter").withoutTags().increment()
Kamon.histogram("myHistogram").withoutTags().record(42L)

Gauges use Double instead of Long

Using Double instead of Long on gauges was a long-standing issue (no pun intended) with the metrics API and we finally made the change. Even though we really don't recommend using gauges, every now and then they are useful and, when they are, the ability to record floating point values can come in handy (think of ratio metrics, or load average if it were tracked with a gauge).
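A small, Kamon-independent illustration of why integer-valued gauges are awkward for ratios: with Long arithmetic the fractional part is discarded before the value ever reaches the gauge. The helper names below are hypothetical.

```scala
// Plain Scala arithmetic, no Kamon involved; helper names are hypothetical.
object GaugeValueSketch {

  // Integer division truncates, so any ratio below 1 collapses to 0.
  def ratioAsLong(used: Long, total: Long): Long = used / total

  // Converting to Double first keeps the fraction.
  def ratioAsDouble(used: Long, total: Long): Double = used.toDouble / total
}
```

For example, `ratioAsLong(3L, 4L)` evaluates to `0`, while `ratioAsDouble(3L, 4L)` evaluates to `0.75`, which is the value one would want a gauge to hold.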

Auto-update all the Things

There was a request to allow providing a function that automatically updates a gauge, which had been a clear need for a long time. While implementing it, we realized that there are a few patterns we use quite often: using a histogram as some sort of "sampler", where we periodically take samples of a value and store them in a Histogram (e.g. storing CPU usage in the system metrics module), or updating a counter with a value coming from a cumulative counter somewhere else on the JVM. We even auto-update internally with the range samplers. So, instead of making auto-update something specific to Gauges, we added it to all instruments: starting from this PR, users can schedule auto-update actions on any instrument.

Furthermore, for the particular case of cumulative counters we added a Counter.delta(...) function that makes it easier to create a producer of values for the auto-update action:

val counter = Kamon.counter("myCounter").withoutTags()
  .autoUpdate(delta(something.produceCumulativeValue()))
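The idea behind turning a cumulative value into increments can be sketched without Kamon. The `delta` below is a hypothetical stand-in, not the signature of `Counter.delta`: it assumes the baseline is read when the function is created and that a decreasing source (for example after a reset) should yield zero rather than a negative increment.

```scala
// Hypothetical stand-in for the concept; NOT the signature of Kamon's Counter.delta.
object DeltaSketch {

  // Wraps a cumulative reading and returns a function that, on every call,
  // yields how much the reading grew since the previous call.
  def delta(readCumulative: () => Long): () => Long = {
    // Assumption: the baseline is taken at creation time.
    var previous = readCumulative()

    () => {
      val current = readCumulative()
      val diff = current - previous
      previous = current
      // Assumption: a negative difference means the source was reset, so report 0.
      if (diff >= 0) diff else 0L
    }
  }
}
```

An auto-update action could then call the returned function on every tick and increment a counter by the result, so the counter only ever sees the growth of the cumulative source.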

Fluent APIs

Instruments return themselves or new instruments when calling most of their APIs, so actions can be chained.

val counter = Kamon.counter("myCounter").withoutTags()
  .increment(42L)
  .withTag("auto", true)
  .autoUpdate(delta(something.produceCumulativeValue()))

Metric and Instrument Inspection for Testing

We rewrote the previous inspection utilities and separated them into MetricInspection and InstrumentInspection with easy access via static members for Java users and implicit extensions for Scala users. From Java, trying to get the distribution of values in an instrument can look like this now:

import static kamon.testkit.InstrumentInspection.distribution;

Histogram histogram = Kamon.histogram("test").withoutTags();
distribution(histogram).count();

PeriodSnapshot changed

The PeriodSnapshot object which is sent to the metric reporters has been updated to reflect the changes in the structure. Also, there have been a number of updates to the PeriodSnapshot accumulator which now can expire entries after they have not been updated for more than a certain period of time.
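The expiration behavior can be pictured with a toy accumulator. This is only a sketch under assumptions: the real accumulator works on PeriodSnapshot data, while the hypothetical `Accumulator` below sums plain Long values per key and drops keys that have not been updated within `expireAfter`.

```scala
import java.time.{Duration, Instant}
import scala.collection.mutable

// Hypothetical toy accumulator; it does not use Kamon's PeriodSnapshot types.
object ExpiringAccumulatorSketch {

  final class Accumulator(expireAfter: Duration) {
    // key -> (accumulated total, time of last update)
    private val totals = mutable.Map.empty[String, (Long, Instant)]

    def add(key: String, value: Long, now: Instant): Unit = {
      val previous = totals.get(key).map(_._1).getOrElse(0L)
      totals.update(key, (previous + value, now))
    }

    // Drops entries whose last update is older than `expireAfter`,
    // then returns the totals that are still alive.
    def snapshot(now: Instant): Map[String, Long] = {
      val cutoff = now.minus(expireAfter)
      val stale = totals.collect {
        case (key, (_, updatedAt)) if updatedAt.isBefore(cutoff) => key
      }.toList
      stale.foreach(key => totals.remove(key))
      totals.map { case (key, (total, _)) => key -> total }.toMap
    }
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; an entry that keeps receiving updates stays in the snapshot, while one that goes quiet eventually disappears.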

Docs Docs Docs

We put some extra effort into making sure that all public APIs have a decent scaladoc that can help users discover how to use metrics and instruments.

Related Issues:

These are some of the issues that we expect to be solving with this PR:

#546: Metric.refine should return a Metric itself, to allow for easy chaining/wrapping.
#568: Why did CurrentValueCollector get removed in new version?
#566: Metrics are reported via kamon-prometheus even though they are removed.
#555: Ensure that all collected metrics are reported after removed.
#543: Allow providing additional tags on StartedTimer.stop.
#514: Allow creating automatically updated gauges.
#519: Support for floating point types are missing.

ivantopo added the In Progress label Mar 14, 2019

ivantopo added 4 commits April 1, 2019 13:41

Rewrite the entire metrics API

0e42dd6

update the status page to match the new metrics model

7f21a96

remove DifferentialSource in favor of Counter.delta

e982b3d

separate SpanReporter into its own file

b8f43b5

ivantopo force-pushed the metrics-refresh branch from 5501358 to 14a52bf Compare April 1, 2019 11:41

ivantopo marked this pull request as ready for review April 1, 2019 11:42

ivantopo mentioned this pull request Apr 1, 2019

Tracer refresh #576

Merged

ensure range samplers get auto-updated

eaff45d

ivantopo force-pushed the metrics-refresh branch from 14a52bf to eaff45d Compare April 8, 2019 17:08

ivantopo added 2 commits April 8, 2019 21:22

use Double instead of Long values for gauges

57a7a9a

add and update documentation comments on the metric package

d8986bc

ivantopo merged commit 2d6556a into kamon-io:master Apr 8, 2019

sgabalda mentioned this pull request Aug 16, 2021

Metrics are reported via kamon-prometheus even though they are removed #566

Open
