Inherit instrumentation library name #1364

Tenaria · 2022-01-14T04:59:35Z

Description:
Adding the feature for generated metrics to inherit instrumentation library name from the span if specified so via config.

There is one caveat to the method I used which is that it'll duplicate the OTel metrics structure when declaring instrumentation library details and have each instrumentation library section twice (one for "calls_total" metrics and one for "latency" metrics) which then have the metrics / data points attached to them. E.g for one span with the same resource attributes and library name, the generated metric structure will be pseudocode like:
--> Metrics
.......--> Resource Metrics
.............--> Instrumentation Library Name A
....................--> Metric Calls Total
...........................--> Data Point Calls Total
.............--> Instrumentation Library Name A
....................--> Metric Latency
...........................--> Data Point Latency

I chose to do it this way because it's a "small-ish" trade-off with the benefit of not having to completely rework the code and the data structures use. Very open to suggestions if you can see a simpler way to do it or if you feel strongly we should do the larger change to have it not duplicate like described above.

Testing:

Tests added to check that instrumentation library name is correctly added when config is specified true
Tests added to check that default instrumentation library name ("spanmetricsprocessor") when config is specified to false
Existing tests modified to suit

Documentation:
README / Config Examples for usage

…nsume traces

…sian-forks/opentelemetry-collector-contrib into inherit-instrumentation-library-name

chenzhihao · 2022-01-19T02:17:18Z

LGTM, only a few minor change suggestions.

I'm thinking we should add an abstraction layer to replace map[resourceKey]map[instrLibKey]map[metricKey]val? The reason is we can see most of the change can be simplified if we have the abstraction layer for the index data structure. It will also provide the ability to follow the Open-Closed principle if we need to change the data structure in future.

It's not required to be done at this moment. But worth considering to be done when we push the stuff to OTel upstream later.

processor/spanmetricsprocessor/processor_test.go

processor/spanmetricsprocessor/processor.go

jamesmoessis · 2022-01-19T06:06:38Z

processor/spanmetricsprocessor/README.md

@@ -74,6 +74,8 @@ The following settings can be optionally configured:
 - `attach_span_and_trace_id` attaches span id and trace id as attributes on metrics generated from spans if set to `true`. If not provided,
  will use default value of `false`.

+  - `inherit_instrumentation_library_name`: defines whether the metrics generated from spans should inherit the same instrumentation library name as the span. If not provided, will use default value of `false` which will define the instrumentation library name as `spanmetricsprocessor` on metrics.


Why do we want to inherit the instrumentation library name of the span? Is it strictly necessary? Is it solely for the one attribute from the span being reflected on the metric?

In my view the instrumentation Library being spanmetricsprocessor is valid, because that's where the metric is generated. If someone really needs to know the original instrumentation library, they can look at the linked span.

The issue with setting everything as spanmetricsprocessor as the instrumentation library is that it can incorrectly aggregate. For example, in our use case, we have HTTP metrics sent from Service Proxy and HTTP metrics sent from the application. These will share the same attributes and as a result they'll all be aggregated together into one blob by the span metrics processor, hence incorrectly showing doubled up data. The only way to differentiate between these two is by checking the instrumentation library name.

I see, yeah it makes sense. Wouldn't they be aggregated in some other way though? Like I would think service.name would be different in the the case you mentioend

Nah, cos the service.name for both would be the application (not Service Proxy). Otherwise, all the Service Proxy metrics for all services would collapse into one giant metric if they had the same service name (and even if we set up another variable to mark the actual service name, this would have issues in the pipeline since we now can't rate limit/throttle per service since everything the service would be set to SP). Using the instrumentation library name seemed like the in-built way in OTel to differentiate, instead of spinning up our own solution. (We had a conversation about this as a team some time last year, I'm not sure if you were there)

Yeah, fair enough! sounds good

jamesmoessis · 2022-01-19T06:09:15Z

processor/spanmetricsprocessor/processor.go

-	latencyCount         map[resourceKey]map[metricKey]uint64
-	latencySum           map[resourceKey]map[metricKey]float64
-	latencyBucketCounts  map[resourceKey]map[metricKey][]uint64
+	latencyCount         map[resourceKey]map[instrLibKey]map[metricKey]uint64


I understand why you've done it this way, but it does make it more complex. It seems the more we add to this the more we are just replicating what the metrics SDK is designed to do, except in a makeshift way.

Probably not necessary for now, but I would like to investigate if it's possible to simply use an SDK instrument and aggregator to create these metrics rather than re-inventing the wheel.

Do you have any links to the stuff you mentioned, I'm not very familiar with it. Is there an in-built aggregator in the SDK? :curious:

We should consider removing the aggregation from the processor. Instead, it should only focus on extraction. See open-telemetry#403 (comment)

Yeah, the other idea is connect it up to other metrics processor so it can handle the funky stuff there. I don't have any links, it's just in my head. But there's definitely a better way, like you mentioned in the linked comment.

processor/spanmetricsprocessor/processor_test.go

Tenaria · 2022-01-20T21:22:34Z

@chenzhihao @jamesmoessis this should be ready for review again, I think I've addressed the PR comments 🥺

chenzhihao · 2022-01-20T23:46:54Z

LGTM

Tenaria added 7 commits January 13, 2022 12:09

starts adding config; adds todos in processor

cb6252e

adds config

6335480

work on adding instrumentation library name to aggregation step of co…

002affa

…nsume traces

Merge branch 'span-metrics-processor-collections' of github.com:atlas…

1e99f6a

…sian-forks/opentelemetry-collector-contrib into inherit-instrumentation-library-name

fixes type cast instrLibKey

bd90118

fii=nish logic; Fix tests; Add tests

cf04417

fixes merge conflicts

19350e0

Tenaria requested review from chenzhihao and jamesmoessis January 14, 2022 04:59

updates readme

4cb6065

chenzhihao reviewed Jan 19, 2022

View reviewed changes

jamesmoessis reviewed Jan 19, 2022

View reviewed changes

Tenaria added 2 commits January 20, 2022 14:42

fixes PR comments

3dfd091

remove unused function

fecee58

jamesmoessis approved these changes Jan 20, 2022

View reviewed changes

Tenaria merged commit d4863aa into collections-main Jan 21, 2022

Tenaria deleted the inherit-instrumentation-library-name branch January 21, 2022 00:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inherit instrumentation library name #1364

Inherit instrumentation library name #1364

Tenaria commented Jan 14, 2022 •

edited

Loading

chenzhihao commented Jan 19, 2022 •

edited

Loading

jamesmoessis Jan 19, 2022

Tenaria Jan 19, 2022

jamesmoessis Jan 20, 2022

Tenaria Jan 20, 2022 •

edited

Loading

jamesmoessis Jan 20, 2022

jamesmoessis Jan 19, 2022

Tenaria Jan 19, 2022

chenzhihao Jan 20, 2022

jamesmoessis Jan 20, 2022

Tenaria commented Jan 20, 2022

chenzhihao commented Jan 20, 2022

Inherit instrumentation library name #1364

Inherit instrumentation library name #1364

Conversation

Tenaria commented Jan 14, 2022 • edited Loading

chenzhihao commented Jan 19, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Tenaria Jan 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Tenaria commented Jan 20, 2022

chenzhihao commented Jan 20, 2022

Tenaria commented Jan 14, 2022 •

edited

Loading

chenzhihao commented Jan 19, 2022 •

edited

Loading

Tenaria Jan 20, 2022 •

edited

Loading