New metric instruments from OTEP 88 #93
Changes from all commits
66d54c4
d7232df
f9892cd
0e57951
2000d81
077dc25
8af170b
d9793af
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,157 @@ | ||
# List of new metric instruments | ||
|
||
Formalize the new metric instruments proposed in [OTEP 88](https://github.com/open-telemetry/oteps/pull/88). | ||
|
||
## Motivation | ||
|
||
OTEP 88 introduced a framework for reasoning about new metric | ||
instruments with various refinements and ended with a [sample | ||
proposal](https://github.com/open-telemetry/oteps/pull/88#sample-proposal). | ||
This proposal uses that proposal as a starting point. | ||
|
||
Note that this proposal is meant to establish the set of standard | ||
instruments in terms of their refinements. This proposal raised | ||
several open questions about naming and default aggregations that were | ||
best addressed in a separate OTEP. See [OTEP | ||
96](https://github.com/open-telemetry/oteps/pull/96) and consider the | ||
names used in this proposal to be provisional--OTEP 96 proposes a | ||
final naming scheme with greater consistency. | ||
|
||
## Explanation | ||
|
||
The four instrument refinements discussed in OTEP 88 are: | ||
It seems like this document has identified 2 different categories of refinements:
Are there any others that people see? Did we want to explicitly include these categories in this OTEP? |
||
|
||
* Sum-only: When computing only a sum is the instrument's primary purpose | ||
For consistency, should this explicitly say that only a sum aggregation is to be used for instruments with this kind of refinement? |
||
* Non-negative: When negative values are invalid | ||
* Precomputed-sum: When the application has computed a cumulative sum itself | ||
* Non-negative-rate: When a negative rate is invalid. | ||
jmacd marked this conversation as resolved.
|
||
|
||
These refinements are not for exposing directly to users at the API | ||
level. These concepts are purely explanatory, used to define the | ||
properties of the metric instruments presented in the API. Following | ||
OTEP 88: | ||
|
||
* Users will select instruments based on their specified properties | ||
* Instruments are associated with a Descriptor that includes the instrument kind (an enumeration) | ||
* Exported metric events include the instrument Descriptor, allowing exporters to interpret event values. | ||
|
||
In other words, these refinements serve to define the set of | ||
instruments. Both users and exporters will deal in concrete kinds of | ||
instrument; these refinements are just for explaining their | ||
properties. | ||
|
||
OTEP 88 describes how we are meant to compute rate information from | ||
instruments having the Non-negative-rate refinement. Temporal | ||
aggregation (over time) must be treated as a special case, compared | ||
with spatial aggregation (over labels). The logic for computing rates | ||
depends on whether the Precomputed-sum refinement is present or not, | ||
which determines whether Delta or Cumulative values are being | ||
captured. | ||
|
||
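The difference between the two rate calculations can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and raising an error on a negative rate is one possible reaction, since the proposal says only that such a rate is invalid.

```python
from typing import Sequence


def _checked(rate: float) -> float:
    # Assumption: a negative rate is surfaced as an error. The proposal only
    # states that a negative rate is invalid under Non-negative-rate.
    if rate < 0:
        raise ValueError(f"negative rate {rate} violates Non-negative-rate")
    return rate


def rate_from_deltas(deltas: Sequence[float], interval_seconds: float) -> float:
    """Delta values: add up the changes in the window, divide by its length."""
    return _checked(sum(deltas) / interval_seconds)


def rate_from_cumulatives(previous: float, current: float,
                          interval_seconds: float) -> float:
    """Cumulative values: subtract the earlier sum from the later one first."""
    return _checked((current - previous) / interval_seconds)
```

With delta values every captured value contributes to the rate, whereas with cumulative values only the first and last sums in the window matter.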
OTEP 88 proposes that when adding new instruments, we specify | ||
instruments having a single purpose, each with a distinct set of | ||
refinements, and each with a carefully selected name for the | ||
properties of the instrument. | ||
|
||
OTEP 88 also proposes language-specific specialization, | ||
to support built-in value types (e.g., timestamps). | ||
|
||
### Delta, Cumulative, and Instantaneous instruments | ||
|
||
Instruments can be categorized as to whether their values are "Delta", | ||
"Cumulative", or "Instantaneous". This depends on which refinements | ||
are present and is required knowledge for exporters of the Sum-only | ||
instruments, since they may need to convert back and forth between | ||
delta-values and cumulative-values. The categorization is determined | ||
as follows: | ||
|
||
Instruments with the Sum-only refinement, but without the | ||
Precomputed-sum refinement, produce delta instruments. | ||
|
||
Instruments with the Sum-only and Precomputed-sum refinements produce | ||
cumulative instruments. | ||
|
||
Instruments without the Sum-only refinement produce instantaneous | ||
measurements. | ||
|
||
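The three rules above amount to a small decision procedure. A minimal sketch, assuming refinements are represented as plain strings (the `Kind` enum and `measurement_kind` function are hypothetical names, not part of any API):

```python
from enum import Enum
from typing import Iterable


class Kind(Enum):
    DELTA = "Delta"
    CUMULATIVE = "Cumulative"
    INSTANTANEOUS = "Instantaneous"


def measurement_kind(refinements: Iterable[str]) -> Kind:
    """Classify an instrument by the refinements it carries."""
    present = set(refinements)
    if "Sum-only" not in present:
        # Without Sum-only, values are instantaneous measurements.
        return Kind.INSTANTANEOUS
    if "Precomputed-sum" in present:
        # The application reports a sum it computed itself.
        return Kind.CUMULATIVE
    # Sum-only without Precomputed-sum: each value is a change to the sum.
    return Kind.DELTA
```

Refinements other than Sum-only and Precomputed-sum (for example Non-negative) do not affect the result.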
## Internal details | ||
|
||
The existing specification includes three instruments: Counter, | ||
Measure, and Observer. In this proposal, the two foundational | ||
instruments, Measure and Observer (synchronous and asynchronous, | ||
respectively), become abstract names, in the sense that all | ||
synchronous instruments are instances of a Measure instrument and all | ||
asynchronous instruments are instances of an Observer instrument. New | ||
instrument names are given to the unrestricted, unrefined versions of | ||
the foundational instruments. The following table summarizes this | ||
proposal: | ||
|
||
| Existing name | Proposed name | Sync or Async | Refinements | Measurement kind | | ||
| ------------- | ------------- | ------------- | ----------- | ---------------- | | ||
| Counter | Counter | Sync | Sum-only, Non-negative, Non-negative-rate | Delta | | ||
| | UpDownCounter | Sync | Sum-only | Delta | | ||
| Measure | Distribution | Sync | _None_ | Instantaneous | | ||
| | Timing | Sync | Non-negative, correct Duration units | Instantaneous | | ||
| Observer | LastValueObserver | Async | _None_ | Instantaneous | | ||
| | DeltaObserver | Async | Sum-only, Non-negative, Non-negative-rate | Delta | | ||
| | CumulativeObserver | Async | Sum-only, Precomputed-sum, Non-negative-rate | Cumulative | | ||
|
||
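One way an exporter might consume this table is through the instrument kind carried in the Descriptor. The sketch below is hypothetical: the `InstrumentKind` and `Descriptor` types and their fields are illustrative assumptions, and only the instrument names and the table contents come from this proposal.

```python
from dataclasses import dataclass
from enum import Enum


class InstrumentKind(Enum):
    COUNTER = "Counter"
    UP_DOWN_COUNTER = "UpDownCounter"
    DISTRIBUTION = "Distribution"
    TIMING = "Timing"
    LAST_VALUE_OBSERVER = "LastValueObserver"
    DELTA_OBSERVER = "DeltaObserver"
    CUMULATIVE_OBSERVER = "CumulativeObserver"


@dataclass(frozen=True)
class Descriptor:
    name: str             # hypothetical field: the metric name
    kind: InstrumentKind  # the enumeration mentioned in the proposal


# "Measurement kind" column of the table, keyed by instrument kind.
MEASUREMENT_KIND = {
    InstrumentKind.COUNTER: "Delta",
    InstrumentKind.UP_DOWN_COUNTER: "Delta",
    InstrumentKind.DISTRIBUTION: "Instantaneous",
    InstrumentKind.TIMING: "Instantaneous",
    InstrumentKind.LAST_VALUE_OBSERVER: "Instantaneous",
    InstrumentKind.DELTA_OBSERVER: "Delta",
    InstrumentKind.CUMULATIVE_OBSERVER: "Cumulative",
}

# "Sync or Async" column of the table.
SYNCHRONOUS = frozenset({
    InstrumentKind.COUNTER,
    InstrumentKind.UP_DOWN_COUNTER,
    InstrumentKind.DISTRIBUTION,
    InstrumentKind.TIMING,
})


def is_synchronous(descriptor: Descriptor) -> bool:
    return descriptor.kind in SYNCHRONOUS


def measurement_kind(descriptor: Descriptor) -> str:
    return MEASUREMENT_KIND[descriptor.kind]
```

An exporter that only understands cumulative sums could, for instance, check `measurement_kind(descriptor) == "Delta"` to decide whether a conversion is needed before export.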
### Synchronous instruments | ||
|
||
Two new synchronous instruments are introduced in this proposal, bringing the total to four. | ||
|
||
1. **Counter** remains unchanged. It uses Sum aggregation by default. | ||
2. **Distribution** is an unrefined Measure instrument. Distribution accepts positive or negative values and uses MinMaxSumCount aggregation by default. | ||
3. **UpDownCounter** is a Sum-only instrument with no other refinements. It supports capturing positive and negative changes to a sum (deltas). UpDownCounter uses Sum aggregation by default. | ||
jmacd marked this conversation as resolved.
|
||
4. **Timing** is a Non-negative instrument specialized for the native clock duration measured on the platform. It ensures that duration values are always captured with correct units, so that exporters can convert duration measurements correctly. | ||
I 100% think we should include this as an instrument, but having a unit based refinement wasn't specified at the start of the OTEP. Can we add something there about this? |
||
|
||
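The difference between Counter and UpDownCounter comes down to the Non-negative refinement on the input. A toy sketch, not an SDK implementation: the class layout and the `add` method are assumptions for illustration, and rejecting a negative input with an exception is only one possible error-handling choice.

```python
class UpDownCounter:
    """Sum-only: accepts positive and negative changes to a sum."""

    def __init__(self) -> None:
        self._sum = 0.0

    def add(self, delta: float) -> None:
        self._sum += delta

    @property
    def value(self) -> float:
        return self._sum


class Counter(UpDownCounter):
    """Sum-only and Non-negative: negative changes are invalid."""

    def add(self, delta: float) -> None:
        if delta < 0:
            # Assumption: invalid input is reported by raising an error.
            raise ValueError("Counter does not accept negative values")
        super().add(delta)
```

For example, an UpDownCounter can track a queue length with `add(3)` followed by `add(-1)`, while the same calls on a Counter fail on the second one.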
### Asynchronous instruments | ||
|
||
Two new asynchronous instruments are introduced in this proposal, bringing the total to three. | ||
|
||
1. **LastValueObserver** is an unrefined Observer instrument. LastValueObserver accepts positive or negative values and uses MinMaxSumCount aggregation by default. | ||
2. **DeltaObserver** is a Sum-only, Non-negative, Non-negative-rate instrument useful for capturing deltas that accumulate during a collection interval. | ||
3. **CumulativeObserver** is a Sum-only, Precomputed-sum, Non-negative, Non-negative-rate instrument useful when reporting precomputed sums. | ||
|
||
Both new asynchronous instruments are meant to be used for aggregating rate information from a callback. | ||
|
||
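Because DeltaObserver reports changes and CumulativeObserver reports running totals, an exporter may have to translate one form into the other. A minimal sketch of both directions, assuming one observed value per collection interval and a known starting total (the function names and the `start` parameter are hypothetical):

```python
from itertools import accumulate
from typing import List, Sequence


def deltas_to_cumulatives(deltas: Sequence[float], start: float = 0.0) -> List[float]:
    """Running totals for a series of per-interval deltas."""
    return [start + total for total in accumulate(deltas)]


def cumulatives_to_deltas(cumulatives: Sequence[float], start: float = 0.0) -> List[float]:
    """Per-interval changes for a series of cumulative sums."""
    deltas = []
    previous = start
    for current in cumulatives:
        deltas.append(current - previous)
        previous = current
    return deltas
```

The two functions are inverses of each other for the same `start`, so either representation carries the same rate information.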
### Instruments not specified | ||
|
||
This proposal brings the number of specified instruments to seven and leaves room for more instruments to be added in the future. As discussed in OTEP 88, other possibilities that we do not propose standardizing include: | ||
|
||
1. **CumulativeCounter** would be a synchronous instrument for reporting a cumulative value with the Non-negative-rate refinement. | ||
2. **UpDownCumulativeCounter** would be a synchronous instrument for reporting a cumulative value with the Non-negative-rate refinement. | ||
3. **AbsoluteDistribution** would be a synchronous instrument for reporting a distribution of non-negative values. | ||
4. **UpDownDeltaObserver** would be an asynchronous instrument for reporting positive and negative deltas to a sum. | ||
5. **UpDownCumulativeObserver** would be an asynchronous instrument for reporting a cumulative sum without a Non-negative-rate refinement. | ||
6. **AbsoluteLastValueObserver** would be an asynchronous instrument for reporting non-negative values. | ||
|
||
These could be standardized in the future if there is sufficient | ||
demand. Although not standard, the behavior of each of these | ||
instruments can be obtained by configuring one of the standard | ||
instruments with non-standard aggregation. We will wait and see. | ||
|
||
## Trade-offs and mitigations | ||
|
||
There are known limitations caused by not standardizing all possible | ||
instrument refinements. Creating too many instruments will create | ||
confusion of its own, so we choose to limit the set of standard | ||
instruments. It is possible that SDK support for configuring | ||
alternate aggregations will avoid the need for more standard | ||
instruments. | ||
|
||
There are potential incompatibilities related to the input range of | ||
existing exporters and the new instruments. For example, a Prometheus | ||
Histogram is used to capture a distribution with non-negative values. | ||
This proposal does not specify a standard AbsoluteDistribution | ||
instrument, which has the corresponding input-range restriction. | ||
|
||
We are left recommending that Prometheus Histogram users adopt an | ||
OpenTelemetry Distribution and continue to capture non-negative | ||
values. Non-negative values will be reported correctly; the only | ||
behavioral difference relates to error handling. Whereas a Prometheus | ||
Histogram generates an error for negative inputs, the OpenTelemetry | ||
Distribution accepts negative inputs. A Prometheus exporter could be | ||
configured to work around this (e.g., by reporting negative | ||
distributions separately), but [metric events that were correct in the | ||
original system will continue to be correct in OpenTelemetry](https://github.com/open-telemetry/oteps/pull/88#discussion_r404912359). |
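
The workaround mentioned above, reporting negative values separately, could look roughly like this on the exporter side. This is a hypothetical sketch: `split_by_sign` is an illustrative helper, not part of any Prometheus client or OpenTelemetry exporter.

```python
from typing import Iterable, List, Tuple


def split_by_sign(values: Iterable[float]) -> Tuple[List[float], List[float]]:
    """Partition Distribution values into (non-negative, negative) lists."""
    non_negative: List[float] = []
    negative: List[float] = []
    for value in values:
        if value < 0:
            negative.append(value)
        else:
            non_negative.append(value)
    return non_negative, negative
```

The non-negative list can be fed to a histogram that rejects negative input, while the negative list is reported through a separate series or dropped, depending on how the exporter is configured.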