A note for the community
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Problem
Our deployment topology: vector-agent -> kafka -> vector-aggregator.
After updating vector-agents from 0.32.2 to 0.33.0, I noticed a significant drop in the number of messages reaching Kafka. For agents with a high log volume (~1000 logs/s), the send rate to Kafka dropped from about 1000 to about 120 messages per second, and disk buffer usage also began to grow.
I also noticed that the kafka_queue_messages metric used to fluctuate between roughly 100 and 700; after the update it sits at exactly 64.
Maybe this is related to #18634? (I suspect this only because of the matching number.)
Thanks for pointing this out. I believe you are correct that this behavior is related to #18634.
The limit was previously hard-coded to 100k, which was too high for many users and resulted in Vector OOM'ing when the upstream applied backpressure. I updated the limit to 64 under the assumption that FutureProducer::send() returned a result once the message was enqueued on the underlying producer's queue. However, that is not the case: rather, as you are encountering, it returns only once the message has actually been sent.
Rather than hard-coding this value to 64 or 100k, it should probably be set equal to queue.buffering.max.messages.
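For reference, here is a minimal Rust sketch of the rdkafka behavior in question. This is illustrative only, not Vector's actual sink code; the broker address, topic, key, and payload are placeholders. It shows that the future returned by FutureProducer::send() resolves on the delivery report, i.e. after the message has actually been delivered, not when it is merely enqueued:

```rust
use std::time::Duration;

use rdkafka::config::ClientConfig;
use rdkafka::producer::{FutureProducer, FutureRecord};

#[tokio::main]
async fn main() {
    // Placeholder broker address; not taken from the original report.
    let producer: FutureProducer = ClientConfig::new()
        .set("bootstrap.servers", "localhost:9092")
        // librdkafka's own in-flight cap (its default is 100000). The
        // suggestion above is to derive the sink's concurrency limit from
        // this setting instead of hard-coding 64.
        .set("queue.buffering.max.messages", "100000")
        .create()
        .expect("producer creation failed");

    // send() enqueues the record on librdkafka's internal queue right away
    // (or fails if the queue is full), but the future it returns completes
    // only when the delivery report arrives, i.e. after the message has
    // actually been sent to the broker.
    let delivery = producer
        .send(
            FutureRecord::to("logs").key("k").payload("hello"),
            Duration::from_secs(0),
        )
        .await;

    match delivery {
        Ok((partition, offset)) => {
            println!("delivered to partition {partition} at offset {offset}")
        }
        Err((err, _msg)) => eprintln!("delivery failed: {err}"),
    }
}
```

Tying the sink's in-flight limit to queue.buffering.max.messages would let librdkafka's own queue bound provide the backpressure, which should roughly restore the old throughput while still letting users lower the value to cap memory usage.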
Just to confirm that I am experiencing the same issue after having upgraded to 0.33.0; it basically killed my whole logging pipeline.
A downgrade restored the previous behaviour.