Vector batch bytes limits are based on in-memory sizing of events #10020
I guess I ran into this issue this week. We had Vector 0.10 working since its release and wanted to upgrade to the latest version. During testing nothing seemed odd, but in production our Firehose delivery stream got throttled: we were making 8000 requests per second to Kinesis instead of 16. We rolled the changes back and I had a look through the different source-code versions of the Kinesis Firehose sink. The last one with working batching was before the great overhaul in v0.18.
@jszwedko, any updates on this issue? Is it part of the roadmap by any chance?
Unfortunately not yet. This'll be a difficult thing to fix.
@tobz pointed out that our current batching mechanism uses the in-memory representation of events to determine their size, which will not match their serialized size. This can result in Vector sending batches that are either above or below the configured batch size. Typically we expect it to be below: the in-memory size of an event has generally been observed to be much greater than its serialized size, so Vector will send suboptimal (undersized) batches. However, if a batch does end up greater than the configured batch size, this could cause failed requests when the batch size was configured to match a sink's API limit.
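To illustrate the gap, here is a minimal Rust sketch (not Vector's actual code; the event representation, size-estimation logic, and JSON encoding below are simplified assumptions). It compares an "allocated bytes" style estimate of an event held in memory against the length of a naive JSON serialization of the same event:

```rust
// Hypothetical sketch: a log event modeled as a list of (key, value) string fields.
// Neither function is Vector's real implementation; both are illustrative.

/// Rough in-memory footprint: the field structs themselves plus their heap buffers.
/// This mimics the kind of allocation-based estimate a batcher might key on.
fn in_memory_size(fields: &[(String, String)]) -> usize {
    std::mem::size_of_val(fields)
        + fields
            .iter()
            .map(|(k, v)| k.capacity() + v.capacity())
            .sum::<usize>()
}

/// Naive JSON-encoded size: {"k":"v",...}, assuming no characters need escaping.
/// Per field: 4 quotes + 1 colon + key + value, plus commas and braces.
fn serialized_size(fields: &[(String, String)]) -> usize {
    let body: usize = fields.iter().map(|(k, v)| k.len() + v.len() + 6).sum();
    if fields.is_empty() { 2 } else { body + 1 }
}

fn main() {
    let event = vec![
        ("message".to_string(), "hello".to_string()),
        ("host".to_string(), "web-1".to_string()),
    ];
    let mem = in_memory_size(&event);
    let wire = serialized_size(&event);
    println!("in-memory estimate: {mem} bytes, serialized: {wire} bytes");
    // A batcher that counts `mem` against batch.max_bytes fills batches to a
    // different point than one counting `wire`, so the configured byte limit
    // drifts away from what the sink's API actually receives.
    assert!(mem > wire);
}
```

Because pointer-heavy in-memory layouts (string headers, capacity slack, struct padding) usually dwarf the compact wire encoding, a limit enforced on the former tends to cut batches far below the configured serialized size, which matches the undersized-batch behavior reported above.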
References:
- RequestBuilder/RequestMetadata to streamline splitting/building #12857
- gcp_cloud_storage sink ignoring batch.max_bytes #14426