Elasticsearch partial bulk failures #140
I spent some time on this as part of #139, and I think doing it right will require some rework of the util code.
I like that and agree. To add more context: this is definitely a later-version feature, and I don't think we should get cute with it. The other issue is that the response body can be very large depending on how much data was flushed, which can also cause performance and memory issues. I think a good first step is to check the root-level error flag.
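For reference, the Elasticsearch bulk API reports a root-level `errors` boolean plus one entry per action under `items`, so the cheap first check described above can skip per-item parsing entirely on the happy path. A minimal Python sketch (the `failed_items` helper and the sample response body are illustrative, not Vector code):

```python
import json

# Example _bulk response body, shaped per the Elasticsearch bulk API:
# a root-level "errors" boolean plus one "items" entry per action.
raw = """
{
  "took": 30,
  "errors": true,
  "items": [
    {"index": {"_id": "1", "status": 201}},
    {"index": {"_id": "2", "status": 400,
               "error": {"type": "mapper_parsing_exception",
                         "reason": "failed to parse"}}}
  ]
}
"""

def failed_items(body: str):
    """Return the per-item failures, only parsing "items" when the
    cheap root-level "errors" flag says something went wrong."""
    resp = json.loads(body)
    if not resp.get("errors"):
        return []
    return [
        action
        for item in resp["items"]      # each item: {"index": {...}} etc.
        for action in item.values()
        if action.get("status", 0) >= 300
    ]

print([f["_id"] for f in failed_items(raw)])  # → ['2']
```

This keeps the large-response concern in mind: when `errors` is false, the `items` array is never walked.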
Noting that a user ran into this again today: https://discord.com/channels/742820443487993987/746070591097798688/775478035738525707 Also noting that @fanatid attempted something like @lukesteensen mentioned with retrying partial failures in the #2755 sink, though it looks like we'll be reverting and reintroducing that handling later. @binarylogic I'd tag this as a
Yeah, there are a few reasons I marked it as
Given the above, I'm wondering whether partial retries are the best path forward, as opposed to a dead-letter sink. There will be circumstances where partial retries will never succeed.
Those are fair reasons: there are certainly cases where partial retries will never succeed, and it would be good to have dead-letter support. However, in the case mentioned in Discord, none of the inserts in the request were successful, would never have succeeded on a retry, and yet we continued processing events without note. Depending on the source the events are coming from, this could require some work on the user's part to replay messages once they noticed that none were making it into ES. At the least, I think we should be logging and reporting a metric for failed inserts into ES.
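To make the logging/metrics idea concrete, here is a hedged sketch (the status classification, `RETRIABLE` set, and `classify` helper are assumptions for illustration, not Vector's actual behavior): split per-item failures into retriable back-pressure errors and permanent ones like mapping conflicts, counting each by error type so nothing is dropped silently:

```python
from collections import Counter

# Transient statuses worth retrying; 4xx errors such as mapping
# conflicts will never succeed on a retry and belong in a dead letter.
RETRIABLE = {429, 503}

def classify(items):
    """Given (status, error_type) pairs as they would appear in a bulk
    response's "items" array, count failures per error type and split
    them into retriable vs. permanent buckets."""
    metrics = Counter()
    retry, dead_letter = [], []
    for status, error_type in items:
        if status < 300:
            continue  # successful insert, nothing to record
        metrics[error_type] += 1
        bucket = retry if status in RETRIABLE else dead_letter
        bucket.append((status, error_type))
    return metrics, retry, dead_letter

items = [(201, None),
         (429, "es_rejected_execution_exception"),
         (400, "mapper_parsing_exception")]
metrics, retry, dead_letter = classify(items)
print(dict(metrics))
```

Emitting `metrics` as counters would at least make the "no events are reaching ES" case visible to operators.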
It seems like that was dropped along the way, because it isn't in master. EDIT: never mind, I see it was just moved 😄 I'll try this out, as it seemed like the user did not see any errors.
Yeah, and clearly it would be nice to have a test for this if there's a way. |
Indeed, I was mistaken; apologies for the goose chase. I do see errors.
We were trying to use Vector as a server that reads data from Kafka and pushes it into Elasticsearch, but right now Vector does not support handling mapping conflicts/errors. We don't want to lose errored-out events. There should be a config option for a DLQ, which would make it possible to manipulate errored-out events with transforms and reprocess them.
If partial retries are too hard to implement for now, I would like to have an option to just retry the full request (which is completely safe when the `_id` field is set), if that would help get this implemented more quickly.
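The idempotency argument above can be sketched as follows (Python; the `bulk_body` helper and the content-hash `_id` are illustrative choices, not Vector's implementation). Giving each document a deterministic `_id` means a full-request retry overwrites the same documents instead of duplicating them:

```python
import hashlib
import json

def bulk_body(events, index="logs"):
    """Build an NDJSON _bulk request where every document gets a
    deterministic _id (here, a hash of its serialized content).
    Replaying the whole request is then safe: the same events map
    to the same _ids, so ES overwrites rather than duplicates."""
    lines = []
    for event in events:
        doc = json.dumps(event, sort_keys=True)
        _id = hashlib.sha256(doc.encode()).hexdigest()
        lines.append(json.dumps({"index": {"_index": index, "_id": _id}}))
        lines.append(doc)
    return "\n".join(lines) + "\n"

body = bulk_body([{"message": "hello"}, {"message": "world"}])
print(body)
```

Building the same body twice yields byte-identical output, which is exactly the property that makes whole-request retries safe.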
Is there any progress here?
Not yet, unfortunately, but it is on our radar.
@jszwedko what is the current location of this issue on your radar? :)
This may be something we tackle in Q4. We'll likely start with the suggested approach of just retrying the whole payload at first and do something more sophisticated in the future to only retry failed events. |
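That first step, retrying the whole payload a bounded number of times, might look roughly like this sketch (`send` is a hypothetical callable returning the parsed bulk response; the backoff values are arbitrary):

```python
import time

def send_with_retry(send, body, max_attempts=3, backoff=0.1):
    """Resend the *entire* bulk payload while the response's root-level
    "errors" flag is set, up to max_attempts times with exponential
    backoff. Safe when documents carry explicit _ids."""
    for attempt in range(max_attempts):
        resp = send(body)
        if not resp.get("errors"):
            return resp
        time.sleep(backoff * 2 ** attempt)
    # Still failing after all attempts; a more sophisticated version
    # could retry only the failed items or hand them to a dead-letter
    # sink instead of giving up here.
    return resp
```

A later iteration could swap the "resend everything" body for one rebuilt from only the failed `items`, without changing this outer loop.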
I would like to think about partial failures when ingesting data into Elasticsearch, and whether we even want to handle this scenario. Options include:
This is not urgent but I wanted to get it up for discussion and posterity.