Merge descriptor type and data points into Data. #202

bogdandrutu · 2020-08-18T23:57:23Z

This PR does not change any semantics, it just re-organize the protos:

Removes MetricDescriptor and adds fields to the Metric directly
Combines MetricDescriptor.Type with specific DataPoints.

This has the following advantages:

Require only one allocation more than the initial version (v0.4.0). When using gogo proto.
In-memory representation is smaller than the initial version (v0.4.0).
Extensible: allows to add new data types in the Metric.
Groups together metadata and points about different aggregation types.
Removes possibility for receiving multiple data points types, which was an undefined behavior.

Fixes #189

Signed-off-by: Bogdan Drutu bogdandrutu@gmail.com

opentelemetry/proto/metrics/v1/metrics.proto

bogdandrutu · 2020-08-19T00:21:44Z

@tigrannajaryan in the Attributes you decided to use oneof instead of the list of all possibilities, I did the same here because most likely we will have a list of more than 5 items as possibilities for the data type, and I remember that you said is ok.

tigrannajaryan · 2020-08-19T14:58:31Z

opentelemetry/proto/metrics/v1/metrics.proto

-  oneof type {
-    // TODO: Determine if encoding all possible values in a uint64 bitset
-    // improves performance significantly and propose that if that is the case.
+  oneof data {


If we are paying the cost of oneof here I would suggest to split Gauge into 2 separate messages, one per data type and add both to this oneof, i.e.:

oneof data { GaugeIn64 gauge_int = 4; GaugeDouble gauge_double = 5; ... }

Same for Sum.

By doing so we will eliminate the need to store the empty unused slice for each Gauge and each Sum. It will reduce memory usage.

You can likely also eliminate MeasurementValueType. It is redundant since that information will be encoded in oneof.

Some thoughts here:

MeasurementValueType cannot be completely removed because still needed in places like Histogram, unless we double that type as well. There may be other aggregations like that that will not have to expose different types based on the measurement value type (e.g. Summary that will most likely be re-added).

I do agree that having a GaugeInt/GaugeDouble is simple enough, for Sum I am a bit worried to duplicate all the fields.

In general case we will have >5 data points per metrics (I remember this from a previous backend that I used), so indeed we may argue that splitting into int/double data points may save some memory for the cost of maintenance of duplicate code.

Another option is to merge #172 and simply don't care about the in-memory extra 8 bytes per point (no effect on the marshal/unmarshal or network size).

Any ideas?

My preference would be to simplify the model and merge #172 and keep the current MeasurementValueType as part of the Sum/Gauge/Histogram (not in the metric because it may not apply to some aggregation)

MeasurementValueType cannot be completely removed

Correct. What I wanted to say is to remove the measurement_value_type field from Gauge/GaugeInt, not remove the MeasurementValueType enum definition.

I do agree that having a GaugeInt/GaugeDouble is simple enough, for Sum I am a bit worried to duplicate all the fields.

Sum is currently 5 fields and I suggest to remove 2. Only 3 fields will remain, 2 of which will be the same (duplicate) for SumInt64 and SumDouble. Not big deal unless we think more common fields will be added in the future.

Another option is to merge #172 and simply don't care about the in-memory extra 8 bytes per point (no effect on the marshal/unmarshal or network size).

That works too. Did you do any benchmarking to see if it can hint to a more performant approach? Other than performance I don't have a strong preference for one of the discussed options, we can use either, both have their own benefits and drawbacks.

I promise a follow-up immediately when this is merged to resolve this with benchmark numbers. For the moment the main change proposed here is to unify metadata about aggregation with the data so I want to move forward with that.

Filed #206 to track this.

opentelemetry/proto/metrics/v1/metrics.proto

jmacd · 2020-08-19T19:48:04Z

opentelemetry/proto/metrics/v1/metrics.proto

+
+  // Numerical value of the measurement that was recorded. Only one of these
+  // two fields is used for the data, based on MeasurementValueType.
+  double double_value = 3;


Similar to Tigran's question above, should we distinguish raw integer and raw double points?

This was just moved from the top of the file. But I tend to agree with #172 proposal you had, and just ignore 8bytes in memory overhead in order to simplify and not duplicate all the fields.

opentelemetry/proto/metrics/v1/metrics.proto

jkwatson

This seems reasonable to me. Just one tiny suggestion on a clarification on the excellent diagram.

opentelemetry/proto/metrics/v1/metrics.proto

This PR does not change any semantics, it just re-organize the protos: * Removes MetricDescriptor and adds fields to the Metric directly * Combines MetricDescriptor.Type with specific DataPoints. This has the following advantages: * Require only one allocation more than the initial version (v0.4.0). When using gogo proto. * In-memory representation is smaller than the initial version (v0.4.0). * Extensible: allows to add new data types in the Metric. * Groups together metadata and points about different aggregation types. Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

bogdandrutu requested review from a team August 18, 2020 23:57

bogdandrutu commented Aug 19, 2020

View reviewed changes

opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved

tigrannajaryan reviewed Aug 19, 2020

View reviewed changes

bogdandrutu force-pushed the mergetypeanddata branch from c97a3e9 to f91b85d Compare August 19, 2020 16:32

jmacd approved these changes Aug 19, 2020

View reviewed changes

bogdandrutu force-pushed the mergetypeanddata branch from f91b85d to 4439b45 Compare August 20, 2020 19:15

bogdandrutu mentioned this pull request Aug 20, 2020

Decide if for Gauge/Sum use measurement type or different types IntGauge/DoubleGauge #206

Closed

tigrannajaryan approved these changes Aug 20, 2020

View reviewed changes

jkwatson reviewed Aug 21, 2020

View reviewed changes

opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved

jkwatson approved these changes Aug 21, 2020

View reviewed changes

anuraaga approved these changes Aug 21, 2020

View reviewed changes

opentelemetry/proto/metrics/v1/metrics.proto Show resolved Hide resolved

bogdandrutu force-pushed the mergetypeanddata branch from 7c6cb32 to e9dc287 Compare August 21, 2020 16:30

bogdandrutu merged commit 8b80e3f into open-telemetry:master Aug 21, 2020

bogdandrutu deleted the mergetypeanddata branch August 21, 2020 16:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge descriptor type and data points into Data. #202

Merge descriptor type and data points into Data. #202

bogdandrutu commented Aug 18, 2020 •

edited

Loading

bogdandrutu commented Aug 19, 2020

tigrannajaryan Aug 19, 2020

bogdandrutu Aug 20, 2020

tigrannajaryan Aug 20, 2020

bogdandrutu Aug 20, 2020

bogdandrutu Aug 20, 2020

jmacd Aug 19, 2020

bogdandrutu Aug 20, 2020

jkwatson left a comment

Merge descriptor type and data points into Data. #202

Merge descriptor type and data points into Data. #202

Conversation

bogdandrutu commented Aug 18, 2020 • edited Loading

bogdandrutu commented Aug 19, 2020

tigrannajaryan Aug 19, 2020

Choose a reason for hiding this comment

bogdandrutu Aug 20, 2020

Choose a reason for hiding this comment

tigrannajaryan Aug 20, 2020

Choose a reason for hiding this comment

bogdandrutu Aug 20, 2020

Choose a reason for hiding this comment

bogdandrutu Aug 20, 2020

Choose a reason for hiding this comment

jmacd Aug 19, 2020

Choose a reason for hiding this comment

bogdandrutu Aug 20, 2020

Choose a reason for hiding this comment

jkwatson left a comment

Choose a reason for hiding this comment

bogdandrutu commented Aug 18, 2020 •

edited

Loading