[exporter/prometheus] Expired metrics were not be deleted #17306
Comments
Can you put a logging exporter on both of the metrics pipelines? I'd like to localize the issue to either the spanmetrics processor or the prometheus exporter.
I think @Aneurysm9 is right. Should the …
@Aneurysm9 sorry for the late reply, I updated the config with logging exporters and a debug log level:

```yaml
receivers:
  # Dummy receiver that's never used, because a pipeline is required to have one.
  otlp/spanmetrics:
    protocols:
      grpc:
        endpoint: "localhost:12345"
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:55677"
processors:
  batch:
  spanmetrics:
    metrics_exporter: otlp/spanmetrics
    latency_histogram_buckets: [10ms, 100ms]
    dimensions:
      - name: db.system
        default: N/A
      - name: db.name
        default: N/A
      - name: db.sql.table
        default: N/A
      - name: db.instance
        default: N/A
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
exporters:
  logging:
    verbosity: basic
  logging/1:
    verbosity: normal
  logging/2:
    verbosity: normal
  otlp/spanmetrics:
    endpoint: "localhost:55677"
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:8889"
    metric_expiration: 5s
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [spanmetrics, batch]
      exporters: [logging]
    # The exporter name must match the metrics_exporter name.
    # The receiver is just a dummy and never used; it is only there to pass
    # validation, which requires at least one receiver in a pipeline.
    metrics/spanmetrics:
      receivers: [otlp/spanmetrics]
      exporters: [otlp/spanmetrics, logging/1]
    metrics:
      receivers: [otlp]
      exporters: [prometheus, logging/2]
  telemetry:
    logs:
      level: debug
```

When I repeated the steps from before, I got logs like this:
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping …
We're also seeing this in some collectors where we do …
Pinging code owners for processor/spanmetrics: @albertteoh. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Pinging code owners for connector/spanmetrics: @albertteoh. See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping …
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been closed as inactive because it has been stale for 120 days with no activity.
@albertteoh @kovrus we are seeing this problem with … The panel below shows how …
[screenshot panel not captured]
Hi @mfilipe, the spanmetrics connector will continue to emit metrics at a configurable time interval (default every 15s) even when no new spans are sent to it, so I believe this is what's preventing the prometheus exporter's metric_expiration from taking effect.

There are some knobs available to influence the number of metrics the spanmetrics connector emits, namely the dimensions cache size and the histogram buckets. The former ensures no more than this many metrics will be stored in memory and emitted downstream. The latter speaks to the resolution of histograms, so a smaller number should reduce the number of metrics exported.
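For reference, a minimal sketch of where those knobs live in a spanmetrics connector config; the option names follow the connector's documented configuration, and the values here are illustrative only, not recommendations:

```yaml
connectors:
  spanmetrics:
    # Caps how many distinct dimension combinations are cached
    # (and therefore re-emitted downstream on every flush).
    dimensions_cache_size: 500
    # Fewer buckets means fewer histogram series exported.
    histogram:
      explicit:
        buckets: [10ms, 100ms, 1s]
    # How often accumulated metrics are flushed downstream
    # (the comment above mentions this defaults to 15s).
    metrics_flush_interval: 15s
```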
tldr; …
It appears the … Put another way, the spanmetricsprocessor needs to expire metrics internally; otherwise the downstream exporter keeps receiving every cached series via spanmetricsprocessor.exportMetrics -> ConsumeMetrics and never gets a chance to expire them.
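To make that interaction concrete, this is the relevant exporter setting from the config earlier in the thread, annotated: metric_expiration only drops a series after the exporter stops receiving new data points for it, so series that are re-sent on every flush never expire.

```yaml
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    # A series is dropped from the /metrics output only after no new data
    # point for it has arrived for this duration. Because the spanmetrics
    # component keeps re-sending every cached series, this timer keeps
    # getting reset and the series never expires.
    metric_expiration: 5s
```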
I believe this issue should be reopened; it seems it was closed due to inactivity but it still persists. I also tried to reach out on Slack some time ago but no luck unfortunately (https://cloud-native.slack.com/archives/CJFCJHG4Q/p1700844297561809).

I'm facing the same scenario as @mfilipe, but none of the suggested solutions really satisfy my needs, as I simply want to stop exporting metrics for spans from services that no longer produce spans. I don't think configuring the cache size or buckets will help me achieve that. I'd like to have a mechanism where such metrics are "expired" once no new spans are being produced.

From my understanding this is the plan proposed by @nijave, but it seems like it has not been implemented yet. Are there any objections or known blockers?
You would need a feature request, assuming it's related to what I thought it was. The spanmetricsprocessor doesn't take time into account and keeps publishing metrics subject to dimensions_cache_size (so it always keeps the last ###). However, the spanmetricsprocessor should actually respect dimensions_cache_size now (#27083). Also worth noting that spanmetricsprocessor is deprecated and replaced with spanmetricsconnector.
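Since the processor is deprecated, here is a sketch of the equivalent wiring with the spanmetrics connector, which also removes the need for the dummy otlp/spanmetrics receiver/exporter pair used in the config earlier in this thread; endpoints and values are carried over for illustration only:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:55677"
connectors:
  spanmetrics:
    dimensions_cache_size: 1000
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    metric_expiration: 5s
service:
  pipelines:
    traces:
      receivers: [otlp]
      # The connector is used as the exporter of the traces pipeline...
      exporters: [spanmetrics]
    metrics:
      # ...and as the receiver of the metrics pipeline, so the dummy
      # receiver/exporter pair and the loopback OTLP hop are no longer needed.
      receivers: [spanmetrics]
      exporters: [prometheus]
```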
Actually I might have lied.
We're also facing the case where metrics are not deleted, e.g. when a service is stopped and is not producing spans anymore. In that case the metric stops counting up but it is still exported. Everything gets (of course) back to a clean state when the opentelemetry collector instance is restarted. To me this issue seems related to issue #21101. Configuring … I've used version 0.90.1 of the opentelemetry-collector.
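One more knob that may be relevant here, assuming your collector version ships it: newer releases of the spanmetrics connector document a metrics_expiration setting that drops series from the connector's own cache when no matching spans arrive for the given duration, which is the behaviour being asked for above. Please verify the option name and semantics against the connector README for your version before relying on this sketch.

```yaml
connectors:
  spanmetrics:
    # Assumption: drops a series from the connector's cache once no spans
    # contributing to it have been seen for this long (0 means never expire).
    # Check the spanmetrics connector README for your collector version.
    metrics_expiration: 5m
```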
Component(s)
exporter/prometheus
What happened?
Description
Hi, I am trying to use the spanmetrics processor and the prometheus exporter to transform spans into metrics, but I found that some expired metrics seem to reappear when new metrics are received, and the memory usage keeps rising. Is this a bug in the prometheus exporter?
Steps to Reproduce
Examples:
- When I posted a span to the collector, the prometheus exporter exported the metric like this: (screenshot)
- After 5 seconds the metric disappears. Then I posted another span to the collector, and the prometheus exporter exported two metrics, including the expired one: (screenshot)
Expected Result
Expired metrics will be deleted.
Actual Result
Expired metrics seem to still be stored in the memory cache.
Collector version
2ada50f
Environment information
Environment
OS: MacOS 13.0.1
Compiler(if manually compiled): go 1.19.3
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response