[@opentelemetry/instrumentation-http] Configure custom metric attributes #4107

Closed
klippx opened this issue Sep 1, 2023 · 5 comments

klippx commented Sep 1, 2023

Is your feature request related to a problem? Please describe.

We are deploying canaries and rely heavily on metrics and dashboards to measure how they compare to the stable version currently in production. However, there is no way to distinguish the Node.js HTTP auto-instrumentation metrics: they all end up with the same metric labels.

I have created a discussion (#4088) but no one was able to help me.

Describe the solution you'd like

I want a clear and intuitive API for configuring custom metric attributes that the user can inject from outside of a span; an illustrative sketch follows.
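To make the request concrete, here is a hypothetical sketch of what such a configuration could look like. The option name metricAttributesFromRequest and the currentRole helper are invented for illustration only; no such hook exists in @opentelemetry/instrumentation-http today.

// Hypothetical sketch: the option name and the helper below are made up.
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';

// App-side helper resolving the pod's current role when a metric is recorded.
const currentRole = () => process.env.DEPLOYMENT_ROLE ?? 'stable';

const httpInstrumentation = new HttpInstrumentation({
  // Invented hook: called for every recorded duration measurement,
  // returning extra attributes to attach to the metric data points.
  metricAttributesFromRequest: (request) => ({
    'deployment.role': currentRole(), // 'canary' or 'stable'
  }),
});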

Describe alternatives you've considered

Inject the attributes from the Prometheus exporter

This feels wrong for two reasons:

  1. Everything else is programmatically configurable, so metrics should be extendable too.
  2. A pod might change from "canary" to "stable" during its lifecycle, and we only know about this in the app, so the attribute needs to come from the app; it would be too hard to do in the low-level Prometheus exporter.

Change the pod name

By changing the pod name we could get exported_job to become different, but that would be bad because they are all the same app, just variants of it, and the resulting breaking changes in dashboards would be too wonky to deal with.

@pichlermarc
Member

Hi, thanks for opening this issue 🙂

Could be that I'm misunderstanding, but I'm thinking https://opentelemetry.io/docs/instrumentation/js/resources/ might be what you're looking for. Is it possible to distinguish canary and stable by the time the container starts? If yes, then it should be possible to use a resource for it. 🙂

Or does that incur the same problems as with adding it to the Prometheus exporter?
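For reference, a minimal sketch of that suggestion, assuming a DEPLOYMENT_ROLE environment variable (a made-up name) is set per pod before the process starts:

import { Resource } from '@opentelemetry/resources';
import { NodeSDK } from '@opentelemetry/sdk-node';

const sdk = new NodeSDK({
  resource: new Resource({
    // Any custom attribute key works; 'service.role' here is just an example.
    'service.role': process.env.DEPLOYMENT_ROLE ?? 'stable',
  }),
});
sdk.start();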


klippx commented Sep 4, 2023

Thanks for the review. I realise that, as a maintainer, it is sensitive to add new APIs/features willy-nilly, so hopefully if something is added it is done in a reasonable manner that is in line with the existing library's ergonomics.

The closest we have managed to solving this with a resource is:

  import { Resource } from '@opentelemetry/resources';
  import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
  import { View, ExplicitBucketHistogramAggregation } from '@opentelemetry/sdk-metrics';
  import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
  import * as opentelemetry from '@opentelemetry/sdk-node';

  // traceConfiguration, attributesFromEnv, metricReader and histogramNames
  // come from the app's own configuration (omitted here).
  const resource = new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: traceConfiguration.serviceName,
    [SemanticResourceAttributes.SERVICE_INSTANCE_ID]:
      process.env.POD_NAME ?? traceConfiguration.serviceName,
    [SemanticResourceAttributes.SERVICE_VERSION]: 'v666', // hardcoded example
    [SemanticResourceAttributes.SERVICE_NAMESPACE]: 'canary', // hardcoded example
    ...attributesFromEnv,
  });

  // ...
  const sdk = new opentelemetry.NodeSDK({
    metricReader,
    resource,
    instrumentations: [ getNodeAutoInstrumentations({ ...  }) ],
    // ...
  });

  // Configure the meter provider with explicit histogram buckets.
  sdk.configureMeterProvider({
    views: histogramNames.map(
      (instrumentName) =>
        new View({
          instrumentName,
          aggregation: new ExplicitBucketHistogramAggregation([
            50, 100, 250, 500, 1000, 2500, 5000, 10000, 20000,
          ]),
        }),
    ),
  });

  sdk.start();

The metrics recorded then look like this:

fed_graph_http_client_duration_bucket{
  exported_instance="product-enrichment-dgs", 
  exported_job="canary/product-enrichment-dgs", 
  http_flavor="1.1", 
  http_method="GET", 
  http_status_code="200", 
  instance="otel-collector:8083", 
  job="opentelemetry-collector", 
  le="+Inf", 
  net_peer_name="cce-internal.euwest1.staging.xxx.care", 
  net_peer_port="443"
}

So it seems we got "canary" into the namespace, which could work (albeit as a hack) if this were possible to determine at boot time.

However, we see a number of issues with this:

  1. It feels like this isn't a namespace. We would rather have a separate attribute called "role" that we can control.
  • If we have more attributes, such as the service version, do we have to "hack" them into the namespace to get them working?
  2. Why weren't other attributes such as SERVICE_VERSION added as a tag? In the example it was hardcoded to "v666".
  • We are happy they weren't, though, because that would explode the dimensions of the metrics; as users we appreciate being in control. But maybe we want the version to distinguish different canary versions from each other? We should be able to tweak it.
  3. We are using Argo Rollouts, which means a pod can be promoted from "canary" to "stable" at runtime, so it is not determined by the time the container starts.
  • We would have to "reconfigure" tracing on a hook, such as in this pseudo code:
// How to detect this? Can this even be reconfigured after sdk.start() was run?
// Or do we have to shut down and recreate it completely?
const onPromotion = () => {
  sdk.configureTracerProvider(
    {
      resource, // resource was updated with namespace "stable"
    },
    spanProcessor,
  );
};
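For comparison, a rough sketch of the shutdown-and-recreate variant raised in the pseudo code above. Here buildSdk and resourceWithRole are hypothetical app-side helpers that rebuild the full NodeSDK configuration; only sdk.shutdown() and sdk.start() are real SDK calls.

// Hypothetical helpers: buildSdk(resource) constructs a fresh NodeSDK with the
// app's metricReader, instrumentations and the given resource; resourceWithRole
// builds a Resource whose role attribute reflects the current deployment role.
let sdk = buildSdk(resourceWithRole('canary'));
sdk.start();

async function onPromotion() {
  // Flush and stop the current SDK, then start a new one with the updated resource.
  await sdk.shutdown();
  sdk = buildSdk(resourceWithRole('stable'));
  sdk.start();
}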

We think that a way of extending attributes as a configuration option at the instrumentation level is in line with the other hooks available in the Node auto-instrumentation; the question is whether we have nailed the API usage/ergonomics.


pichlermarc commented Sep 5, 2023

Hmm, you're right; it does not really fit the namespace. I have to say that Prometheus is not really my area of expertise, but IIRC resources are usually exported via the target_info metric, which contains all resource attributes as labels; one should then join with target_info. I've found this blog post that shows how to do this. (Note: the post linked above also recommends using an OTel Collector and its transform processor to copy the labels from the Resource onto the telemetry; that would also be an option, but it would require deploying another component.)

If I understand correctly, this should bring the concept of resource attributes into Prometheus, and then it is possible to split by resource attributes, so it should work with any resource attribute instead.

Of course, if the value of the resource attribute is not determined by the time the container starts, then that's challenging, because I think we don't have a concept for that in OpenTelemetry at the moment. As such, I think the whole topic should be addressed at the specification level rather than at the level of an individual instrumentation, as this would most likely concern not only the telemetry emitted from the http instrumentation but all other instrumentations (across all languages) as well. I'm aware of open-telemetry/oteps#207, but I think that proposal may require some hacking to make it work for this use case. Then there's also #4045, which prototypes "global attributes" (only for traces and logs at the moment) and sounds more like what would be necessary here. 🤔

Thinking outside what's available in the app, I'm wondering if collector-contrib's kubernetes attributes processor would be able to apply a stable/canary resource attribute based on a k8s label? 🤔


dyladan commented Sep 12, 2023

> I have created a discussion (#4088) but no one was able to help me.

Have to admit I always forget the discussions feature exists

github-actions bot commented Nov 13, 2023

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions bot added the stale label Nov 13, 2023
klippx closed this as completed Nov 22, 2023