[exporter/elasticsearchexporter] Push failures not reported in metrics #32302
Comments
Pinging code owners for exporter/elasticsearch: @JaredTan95 @ycombinator. See Adding Labels via Comments if you do not have permissions to add labels yourself. |
Nice catch, it's indeed a bug. |
As of now I am quite a beginner in Go and OTEL code alike, though I'm trying to get to the bottom of it to fix it. I assume those cases in elasticsearch_bulk.go should return the error so that it bubbles up. |
@jleloup You're on the right path here. The trick is going to be figuring out how to make |
Sadly I won't have much time to work on this issue. |
So we tried an approach to fix this using a WaitGroup, though we basically get context deadline exceeded at every occurrence. The thing is, it seems we have a conflicting situation here, where on one hand OTEL expects exceptions to be returned synchronously for each call of Is there a way to get a handle on the core exporter metric itself so we could update it directly in BulkIndexer? |
Well actually I just saw PR #32359 and it seems it would change the game quite a lot, especially going for flushing data at each PushLogData() indeed. |
First, does this issue still need work? It looks pretty old. I've never contributed to an open source project before (besides fixing typos in documentation), but I've read this repository's contributing.md. Is there anything I still need to do before someone can assign the issue to me? |
AFAIK this issue still stands: we repeatedly have mapping issues with no impact on the metric. Last time I checked, a refactoring was ongoing on the underlying lib used to bulk items in Elasticsearch. Now that this has been merged, I assume it should allow flushing items on demand in the exporter, and thus fetching the number of failed events during the bulk action and reporting it to OTEL. I don't have time allocated to this right now, though. Is there someone who could have a look at it? |
Hey, I can give this issue a try. I am new to the OTel collector but I understand the issue. |
PR #32632 addresses the actual problem. |
Sure @carsonip. Let me know once the PR is merged. I can help testing this, as I am able to replicate the issue. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. |
Any updates? This would be really useful for alerting. |
I tested with commit b14f1d4 and managed to reproduce this issue with a mapping conflict (HTTP 4xx), even with The reason for the difference between 4xx and 5xx behavior is that failed documents with HTTP 4xx (except 429) are considered non-retriable, including permission errors and mapping conflicts. They are not considered an "error" of a flush, and are not propagated upstream to avoid unnecessary retries. |
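The rule described in the comment above can be sketched roughly as follows. This is a simplified illustration, not the exporter's code: `itemResult`, `isRetriable` and `tally` are hypothetical names, and treating every 5xx status as retriable is an assumption inferred from the 4xx versus 5xx distinction the comment draws.

```go
package main

import (
	"fmt"
	"net/http"
)

// itemResult is a hypothetical stand-in for one per-document entry
// in an Elasticsearch bulk response.
type itemResult struct {
	Status int
}

// isRetriable follows the rule from the comment: 429 is retriable, other 4xx
// are not. Treating all 5xx as retriable is an assumption of this sketch.
func isRetriable(status int) bool {
	return status == http.StatusTooManyRequests || status >= 500
}

// tally counts failed documents, split into retriable and permanent failures.
// Statuses below 300 are treated as success and skipped.
func tally(items []itemResult) (retriable, permanent int) {
	for _, it := range items {
		switch {
		case it.Status < 300:
			continue
		case isRetriable(it.Status):
			retriable++
		default:
			permanent++ // e.g. mapping conflict (400) or missing permission (403)
		}
	}
	return retriable, permanent
}

func main() {
	items := []itemResult{{201}, {400}, {403}, {429}, {503}}
	r, p := tally(items)
	fmt.Printf("retriable=%d permanent=%d\n", r, p)
}
```

In this sketch the permanent failures are the ones the issue is about: if they are silently dropped rather than counted, nothing reaches the failure metric.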
Component(s)
exporter/elasticsearch
What happened?
Description
I think the Elasticsearch exporter does not report failures properly in the otelcol_exporter_send_failed_log_records metric.
I am testing the behaviour of the Elasticsearch exporter in the context of various failures, e.g.:
During these tests I can see errors being reported in the logs, e.g. for a mapping conflict: see log output for an example.
Though I can't see such failures in the otelcol_exporter_send_failed_log_records metric:
I saw the very same behavior with another type of error: missing Elasticsearch permissions to create indices, which prevents the exporter from creating a new index when pushing if said index does not exist.
Steps to Reproduce
Trigger an Elasticsearch push failure:
As an example: map Body field as a long while pushing a typical log record from any receiver such as filelog.
Expected Result
The otelcol_exporter_send_failed_log_records counter is increased according to the failed log records.
Actual Result
otelcol_exporter_send_failed_log_records does not increase.
Collector version
v0.97.0
Environment information
Environment
Kubernetes 1.28
OpenTelemetry Collector configuration
Log output
Additional context
No response