-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge descriptor type and data points into Data. #202
Merge descriptor type and data points into Data. #202
Conversation
@tigrannajaryan in the Attributes you decided to use oneof instead of the list of all possibilities, I did the same here because most likely we will have a list of more than 5 items as possibilities for the data type, and I remember that you said is ok. |
oneof type { | ||
// TODO: Determine if encoding all possible values in a uint64 bitset | ||
// improves performance significantly and propose that if that is the case. | ||
oneof data { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are paying the cost of oneof
here I would suggest to split Gauge
into 2 separate messages, one per data type and add both to this oneof
, i.e.:
oneof data {
GaugeIn64 gauge_int = 4;
GaugeDouble gauge_double = 5;
...
}
Same for Sum
.
By doing so we will eliminate the need to store the empty unused slice for each Gauge and each Sum. It will reduce memory usage.
You can likely also eliminate MeasurementValueType. It is redundant since that information will be encoded in oneof
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts here:
- MeasurementValueType cannot be completely removed because still needed in places like Histogram, unless we double that type as well. There may be other aggregations like that that will not have to expose different types based on the measurement value type (e.g. Summary that will most likely be re-added).
- I do agree that having a GaugeInt/GaugeDouble is simple enough, for Sum I am a bit worried to duplicate all the fields.
In general case we will have >5 data points per metrics (I remember this from a previous backend that I used), so indeed we may argue that splitting into int/double data points may save some memory for the cost of maintenance of duplicate code.
Another option is to merge #172 and simply don't care about the in-memory extra 8 bytes per point (no effect on the marshal/unmarshal or network size).
Any ideas?
My preference would be to simplify the model and merge #172 and keep the current MeasurementValueType as part of the Sum/Gauge/Histogram (not in the metric because it may not apply to some aggregation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MeasurementValueType cannot be completely removed
Correct. What I wanted to say is to remove the measurement_value_type
field from Gauge/GaugeInt, not remove the MeasurementValueType enum definition.
I do agree that having a GaugeInt/GaugeDouble is simple enough, for Sum I am a bit worried to duplicate all the fields.
Sum is currently 5 fields and I suggest to remove 2. Only 3 fields will remain, 2 of which will be the same (duplicate) for SumInt64 and SumDouble. Not big deal unless we think more common fields will be added in the future.
Another option is to merge #172 and simply don't care about the in-memory extra 8 bytes per point (no effect on the marshal/unmarshal or network size).
That works too. Did you do any benchmarking to see if it can hint to a more performant approach? Other than performance I don't have a strong preference for one of the discussed options, we can use either, both have their own benefits and drawbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I promise a follow-up immediately when this is merged to resolve this with benchmark numbers. For the moment the main change proposed here is to unify metadata about aggregation with the data so I want to move forward with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #206 to track this.
c97a3e9
to
f91b85d
Compare
|
||
// Numerical value of the measurement that was recorded. Only one of these | ||
// two fields is used for the data, based on MeasurementValueType. | ||
double double_value = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to Tigran's question above, should we distinguish raw integer and raw double points?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just moved from the top of the file. But I tend to agree with #172 proposal you had, and just ignore 8bytes in memory overhead in order to simplify and not duplicate all the fields.
f91b85d
to
4439b45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to me. Just one tiny suggestion on a clarification on the excellent diagram.
This PR does not change any semantics, it just re-organize the protos: * Removes MetricDescriptor and adds fields to the Metric directly * Combines MetricDescriptor.Type with specific DataPoints. This has the following advantages: * Require only one allocation more than the initial version (v0.4.0). When using gogo proto. * In-memory representation is smaller than the initial version (v0.4.0). * Extensible: allows to add new data types in the Metric. * Groups together metadata and points about different aggregation types. Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
7c6cb32
to
e9dc287
Compare
This PR does not change any semantics, it just re-organize the protos:
This has the following advantages:
Fixes #189
Signed-off-by: Bogdan Drutu bogdandrutu@gmail.com