
The trace_id/span_id of exemplars does not change when using the prometheusremotewrite exporter #30830

Closed
toughnoah opened this issue Jan 29, 2024 · 10 comments

Comments

@toughnoah

Component(s)

exporter/prometheusremotewrite

What happened?

Description

When I use the spanmetrics connector, the trace_id/span_id of the exemplars does not change if I use the prometheusremotewrite exporter to write directly to the Prometheus/Mimir backend.

Steps to Reproduce

Simply start a demo application that produces traces and sends them to the OpenTelemetry Collector.

Expected Result

The trace_id and span_id of the exemplar should change when I start a new trace.

Actual Result

The trace_id/span_id is always the same, and the only way to get a new one is to restart the otel collector.
[screenshots attached]

Collector version

v0.91.0

Environment information

Environment

Kubernetes 1.25

OpenTelemetry Collector configuration

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    exemplars:
      enabled: true
      max_per_data_point: 1000
    metrics_flush_interval: 30s
receivers:
  otlp:
    protocols:
      http:
        endpoint: :4317
      grpc:
        endpoint: :4318
service:
  telemetry:
    logs:
      level: "debug"
  extensions: []
  pipelines:
    metrics:
      receivers: [otlp, spanmetrics]
      exporters: [prometheusremotewrite]
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
exporters:
  prometheusremotewrite: # the PRW exporter, to ingest metrics to backend
    endpoint: http://mimir-distributed-nginx.grafana/api/v1/push
    remote_write_queue:
      enabled: false
    resource_to_telemetry_conversion:
      enabled: true

Log output

No response

Additional context

No response

toughnoah added the bug and needs triage labels on Jan 29, 2024
github-actions bot

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.


github-actions bot commented Feb 6, 2024

Pinging code owners for connector/spanmetrics: @portertech. See Adding Labels via Comments if you do not have permissions to add labels yourself.


github-actions bot commented Apr 8, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ankitpatel96
Contributor

Hi,
I have been unable to reproduce this issue locally. I've configured the collector similarly to yours, and I'm running Prometheus locally in Docker with the default configuration (plus flags to enable exemplar storage and remote write). To generate traces I'm running telemetrygen traces --otlp-insecure --duration 5m --rate 4.

When I look at Prometheus, each of the exemplars has a different span and trace ID. Do you have any more details on your setup, or can you try using something like telemetrygen to generate your traces, to rule out your data?

github-actions bot removed the Stale label on May 4, 2024
mx-psi added the waiting for author label and removed the needs triage label on May 6, 2024
@toughnoah
Author

@ankitpatel96 Hi, I am using otel-collector v0.91.0, and I have a service that generates traces and spans. I have fixed it locally: when I debugged into pkg/translator/prometheusremotewrite/helper.go, I found that the getPromExemplars method just appends new exemplars to the slice, but Prometheus will always pick up the first exemplar of the slice.
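For illustration, a minimal Go sketch of the behavior described above (a hypothetical reproduction, not the actual collector source; appendExemplar and the trace IDs are made up): if exemplars are only ever appended to the same prompb.TimeSeries and the backend surfaces only the first element of the slice, the displayed trace_id never changes.

package main

import (
	"fmt"

	"github.com/prometheus/prometheus/prompb"
)

// appendExemplar mimics an exporter that keeps appending exemplars to the
// same series across flushes; the first element of the slice never changes.
func appendExemplar(ts *prompb.TimeSeries, traceID string, value float64, millis int64) {
	ts.Exemplars = append(ts.Exemplars, prompb.Exemplar{
		Labels:    []prompb.Label{{Name: "trace_id", Value: traceID}},
		Value:     value,
		Timestamp: millis,
	})
}

func main() {
	var ts prompb.TimeSeries
	appendExemplar(&ts, "trace-aaaa", 1, 1000)
	appendExemplar(&ts, "trace-bbbb", 2, 2000)
	// A backend that only inspects ts.Exemplars[0] keeps reporting "trace-aaaa".
	fmt.Println(ts.Exemplars[0].Labels[0].Value)
}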

@ankitpatel96
Contributor

Thanks for the response! Can you clarify the problem? Is span metrics working as expected? Is the problem in Prometheus or in the prometheusremotewrite exporter? I glanced through the Prometheus codebase and, at first glance (I really could be wrong), Prometheus does read through all the exemplars: https://github.com/prometheus/prometheus/blob/d699dc3c7706944aafa56682ede765398f925ef0/storage/remote/write_handler.go#L140-L147
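For reference, a compact Go sketch of what the loop in the linked write_handler.go appears to do (a paraphrase, not the actual Prometheus source; collectExemplars is a hypothetical helper): every exemplar on every series in the remote-write request is visited, not just the first one.

package main

import (
	"fmt"

	"github.com/prometheus/prometheus/prompb"
)

// collectExemplars walks every series and every exemplar in a remote-write
// request instead of reading only Exemplars[0].
func collectExemplars(req *prompb.WriteRequest) []prompb.Exemplar {
	var out []prompb.Exemplar
	for _, ts := range req.Timeseries {
		out = append(out, ts.Exemplars...)
	}
	return out
}

func main() {
	req := &prompb.WriteRequest{Timeseries: []prompb.TimeSeries{{
		Exemplars: []prompb.Exemplar{
			{Labels: []prompb.Label{{Name: "trace_id", Value: "trace-aaaa"}}},
			{Labels: []prompb.Label{{Name: "trace_id", Value: "trace-bbbb"}}},
		},
	}}}
	fmt.Println(len(collectExemplars(req))) // prints 2: both exemplars are seen
}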

@toughnoah
Author

@ankitpatel96 Hi, it is the prometheusremotewrite exporter; the prometheus exporter works fine. Yes, you can see from my screenshots that span metrics works fine, but with the prometheusremotewrite exporter, Mimir/Prometheus only ever picks up the first element in the exemplars slice, no matter how many new traces I start. I modified the source code to pick up only the last exemplar, and that solved it.
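A minimal Go sketch of the workaround described above (an assumed local modification, not a proposed upstream fix; keepLatestExemplar is a hypothetical helper): before a series is written, drop all but the most recently appended exemplar.

package main

import (
	"fmt"

	"github.com/prometheus/prometheus/prompb"
)

// keepLatestExemplar trims the series so only the most recently appended
// exemplar remains to be written out.
func keepLatestExemplar(ts *prompb.TimeSeries) {
	if n := len(ts.Exemplars); n > 1 {
		ts.Exemplars = ts.Exemplars[n-1:]
	}
}

func main() {
	ts := prompb.TimeSeries{Exemplars: []prompb.Exemplar{
		{Labels: []prompb.Label{{Name: "trace_id", Value: "trace-aaaa"}}},
		{Labels: []prompb.Label{{Name: "trace_id", Value: "trace-bbbb"}}},
	}}
	keepLatestExemplar(&ts)
	fmt.Println(ts.Exemplars[0].Labels[0].Value) // prints "trace-bbbb"
}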

@ankitpatel96
Contributor

Have you filed a ticket with Mimir? It sounds like a bug in their system rather than in the prometheusremotewrite exporter.

github-actions bot

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Jul 10, 2024

github-actions bot commented Sep 8, 2024

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Sep 8, 2024