[receiver/kafkareceiver] autocommit set false does not take effect when exporter failed #37136
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
@ChrisYe2015 could you show the details of how your messages are published to Kafka? Since you haven't set an encoding, I believe you are using the default, which is otlp_proto. If you could help me reproduce the issue, I'm interested in your case. I believe it might be more related to the Kafka configurations themselves rather than the collector. For example, have you tried retention configurations like KAFKA_LOG_RETENTION_HOURS?
@apolzek Thank you for your prompt reply. KAFKA_LOG_RETENTION_HOURS is already set, and I use the following architecture to collect traces:
kafka docker-compose.yml:
No other additional config. Please check my settings, or let me know if you have any other suggestions, thanks!
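The reporter's actual compose file and collector settings are not shown above. Purely as a point of reference for the producing side of the described architecture (an application sending OTLP to a first collector, which writes to Kafka), a minimal sketch might look like the following; the broker address, topic, and port are assumptions, not values taken from this issue:

```yaml
# Hypothetical producing-side collector config (app -> OTLP -> Kafka).
# Broker address, topic, and port are illustrative assumptions.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  kafka:
    brokers:
      - kafka:9092          # assumed broker address
    topic: otlp_spans       # default traces topic
    encoding: otlp_proto    # default encoding mentioned in the discussion
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [kafka]
```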
I'll take a look at this over the weekend.
@ChrisYe2015 I'm still analyzing the collector source code to confirm what I'm about to say, but regardless of whether autocommit is enabled or not, the messages are not lost. From what I understand, this configuration only has an impact if the test involves a failure in the second OpenTelemetry Collector (the one consuming from Kafka). The scenario you expect, with an increase in lag on the Kafka consumer, didn't happen in my failure tests with Jaeger. I believe this is for a couple of reasons. The first might be related to offset control, where the collector knows up to which point a message has been successfully processed and exported. The second guess is that the collector has a buffer that holds these messages. I'm not an expert on the collector, but I'll have more information soon. If anyone with experience comes by to contribute, that would be great; otherwise, please wait while I look for evidence. My test is accessible here! It's important to note that the lag only occurs when there is a very high volume of messages per second; it's not directly related to a failure in the jaeger exporter, at least based on my tests 🤔
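On the "buffer" guess above: most collector exporters expose the standard exporterhelper queue and retry settings, which can hold data in memory while the backend is down and would mask a short Jaeger outage. A minimal sketch of those settings, assuming the consuming collector forwards to Jaeger over OTLP (the endpoint and sizes are illustrative, not taken from the issue):

```yaml
# Sketch of exporterhelper buffering/retry on the consuming collector; values are assumptions.
exporters:
  otlp:
    endpoint: jaeger:4317      # assumed Jaeger OTLP endpoint
    tls:
      insecure: true
    sending_queue:
      enabled: true
      queue_size: 1000         # in-memory buffer that can absorb short outages
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 300s   # after this, data is dropped rather than retried
```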
Component(s)
receiver/kafka
What happened?
Description
I used the Kafka receiver and the Jaeger exporter, and turned off autocommit in otelcol-config.yml.
When Jaeger is unavailable, I expect Kafka messages not to be consumed and the lag to increase, but this does not happen.
Is my configuration incorrect, or does the current feature not support holding back offset commits when the exporter fails?
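For reference, the consuming side described here (Kafka receiver with autocommit disabled) would look roughly like the sketch below. The broker, topic, and group id are assumptions; `message_marking` is included only because, per my understanding of the receiver's documented settings, it is the related option that controls when messages are marked as consumed:

```yaml
# Hypothetical consuming-side kafka receiver config; broker/topic/group id are assumptions.
receivers:
  kafka:
    brokers:
      - kafka:9092
    topic: otlp_spans
    group_id: otel-collector
    autocommit:
      enable: false            # the setting this issue disables
    message_marking:
      after: true              # mark messages only after processing (related option)
      on_error: false          # do not mark messages whose processing failed
```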
Steps to Reproduce
1. Stop Jaeger
2. Disable autocommit and restart otel-collector
3. Send messages to Kafka
Expected Result
Lag increases while Jaeger is down, and consumption resumes when Jaeger is ready
Actual Result
Lag = 0 and the offsets are committed
Collector version
v0.116.0
Environment information
Environment
linux docker
OpenTelemetry Collector configuration
Log output
Additional context
No response