This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Metric naming conventions #108

Merged
22 commits
a59a7fd
Proposal for metric naming conventions
tedpennings May 21, 2020
cece358
Add Node example metrics
tedpennings May 21, 2020
63443ef
Node.js instead of Node
tedpennings May 21, 2020
17b8993
Rename file, add Prometheus quote
tedpennings May 22, 2020
444cb79
Second round of revisions
tedpennings May 28, 2020
1727642
Working group feedback
tedpennings May 28, 2020
1ead372
More feedback changes
tedpennings May 28, 2020
6238766
Minor clarifications
tedpennings May 28, 2020
0914ded
Word choice
tedpennings May 28, 2020
509b6e5
Whitespace to check CLA status
tedpennings May 28, 2020
4316780
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
a9fdbfe
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
b60dfa7
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
3fb9258
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
17ab176
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
0ba3d76
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
5a31caf
Update text/metrics/0108-naming-conventions.md
tedpennings Jun 11, 2020
f2c990e
Code review feedback, remove discussion section
tedpennings Jun 11, 2020
2191122
Remove some discussion topics, and fix an example
tedpennings Jun 11, 2020
1d61a72
Merge branch 'master' into metric-naming-conventions
bogdandrutu Jul 10, 2020
8b88b36
Rename OTEP 108 to metric naming _guidelines_
justinfoote Jul 14, 2020
faa0138
Merge branch 'master' into metric-naming-conventions
yurishkuro Jul 17, 2020
23 changes: 23 additions & 0 deletions text/metrics/0108-naming-conventions.md
@@ -0,0 +1,23 @@
# Metric instrument naming conventions

## Purpose

Names and labels for metric instruments are primarily how humans interact with metric data -- users rely on these names to build dashboards and perform analysis. The names and hierarchical structure need to be understandable and discoverable during routine exploration -- and this becomes critical during incidents.

To support these goals and keep future metric naming standards consistent, this document outlines a meta-standard for metric names.

## Guidelines

Metric names and labels exist within a single universe and a single hierarchy. Metric names and labels MUST be considered within the universe of all existing metric names. When defining new metric names and labels, consider the prior art of existing standard metrics and metrics from frameworks/libraries.

Associated metrics SHOULD be nested together in a hierarchy based on their usage. Define a top-level hierarchy for common metric categories: for OS metrics, like CPU and network; for app runtimes, like GC internals. Libraries and frameworks should nest their metrics into a hierarchy as well. This aids discovery and ad hoc comparison, and lets a user who has found one metric locate similar ones nearby.

The hierarchical structure of metrics defines the namespacing. Supporting OpenTelemetry artifacts define the metric structures and hierarchies for some categories of metrics, and these can assist decisions when creating future metrics.
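As a rough illustration of how a hierarchy helps discovery, the sketch below finds metrics that share a namespace prefix with a given metric. It assumes dot-separated names; the metric names, the `related` helper, and its `depth` parameter are all hypothetical and not part of any OpenTelemetry API.

```python
# Hypothetical metric names, invented for illustration only.
METRIC_NAMES = [
    "system.cpu.usage",
    "system.cpu.time",
    "system.network.io",
    "runtime.java.gc.pause",
]


def related(name, all_names, depth=2):
    """Return the other metrics whose first `depth` name components match.

    Assumes names are dot-separated, so the leading components act as
    the namespace of the metric.
    """
    prefix = name.split(".")[:depth]
    return sorted(
        other
        for other in all_names
        if other != name and other.split(".")[:depth] == prefix
    )
```

Under these assumptions, `related("system.cpu.usage", METRIC_NAMES)` returns only the other `system.cpu.*` metric, while `depth=1` widens the search to everything under `system.`.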

Common labels SHOULD be consistently named. This aids discoverability and helps disambiguate labels that look similar to metric names.

Member

> Common labels SHOULD be consistently named

What does "common labels" mean? Common in what context/scope? A single service? An organization?

Contributor

I was hoping to avoid a debate over naming label keys. For example, you have a label key named "service" and have used it on some metrics, and I have a label key named "service" and have used it on some different metrics. How are we to know that those labels are not the same? The answer would be to add namespacing of labels. I recall the OpenCensus guidelines were to prefix your label names with a DNS prefix that you own. So I might have a lightstep.com/service label and you might have an uber.com/service label. For this to create a good user experience, I'd like the DNS prefix to not display by default. What would you like to see, @yurishkuro?

Member

The way I read this guidance is: if we have a label that should be added to many different categories of metric instrument, and that label's semantic meaning is the same across all those categories, its name should be consistent.

The most obvious example I can think of would be status, whose value will be a CanonicalSpanStatus.

As a user, I would find it intuitive when searching my metrics in my UI to always find the success/failure information under the same status label.

I'm not sure I understand the example service label. Would it be the name of the service being instrumented? If so, perhaps we would want some semantic conventions around how to apply Resource attributes as metric labels.

If this is the case, I'm not sure if we need to change this line. Is this guidance not clear enough? What wording would make our meaning more clear?

Contributor

I think it's safe to say we can merge this and debate this topic again as we modify the specification.

["As a rule of thumb, **aggregations** over all the dimensions of a given metric **SHOULD** be meaningful,"](https://prometheus.io/docs/practices/naming/#metric-names) as Prometheus recommends.

Semantic ambiguity SHOULD be avoided. Use prefixed metric names in cases where similar metrics have significantly different implementations across the breadth of all existing metrics. For example, every garbage-collected runtime has slightly different strategies and measures. Using a single set of metric names for GC, not divided by the runtime, could create dissimilar comparisons and confusion for end users. (For example, prefer `runtime.java.gc.*` over `runtime.gc.*`.) Measures of many operating system metrics, by contrast, are similar.
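A minimal sketch of the concern, assuming a backend that groups samples into one series per metric name. The `pause` suffix, the sample values, and the `group_by_name` helper are hypothetical and chosen only to show how a shared name merges measurements that a runtime prefix would keep apart.

```python
from collections import defaultdict


def group_by_name(samples):
    """Group (name, value) samples into one list of values per metric name."""
    series = defaultdict(list)
    for name, value in samples:
        series[name].append(value)
    return dict(series)


# Hypothetical samples: one value from a Java process, one from Node.js.
shared = [("runtime.gc.pause", 12.0), ("runtime.gc.pause", 0.4)]
prefixed = [("runtime.java.gc.pause", 12.0), ("runtime.nodejs.gc.pause", 0.4)]

# With a shared name, both runtimes land in the same series.
shared_series = group_by_name(shared)

# With runtime-specific prefixes, each runtime keeps its own series.
prefixed_series = group_by_name(prefixed)
```

Here `shared_series` holds a single series mixing both runtimes, whereas `prefixed_series` holds two, so any aggregation stays within one runtime's semantics.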

Conventional metrics, and metrics that have their units included in OpenTelemetry metadata (e.g. `metric.WithUnit` in Go), SHOULD NOT include the units in the metric name. Units may be included when they provide additional meaning to the metric name. Metrics MUST, above all, be understandable and usable.
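The unit guidance could be checked with a small lint-style helper, sketched below. `MetricSpec` and `name_repeats_unit` are hypothetical names that stand in for wherever a metric's name and unit metadata are declared; they are not an OpenTelemetry API. Because units are allowed in the name when they add meaning, a positive result is best treated as a hint for review rather than an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSpec:
    """Hypothetical description of a metric: its name plus unit metadata."""

    name: str
    unit: str = ""


def name_repeats_unit(spec):
    """Return True if the last dot-separated name component equals the unit.

    Only an exact, case-insensitive match is detected; a metric with no
    declared unit never matches.
    """
    if not spec.unit:
        return False
    last_component = spec.name.rsplit(".", 1)[-1]
    return last_component.lower() == spec.unit.lower()
```

For example, a spec named `http.server.duration.ms` with unit `ms` would be flagged, while `http.server.duration` with the same unit would not.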