[awss3 input] unescape characters in s3 file names #38012
The error happens because the poller turns the original key
into
The poller performs this change at: beats/x-pack/filebeat/input/awss3/s3.go, lines 211 to 212 in aba91a4
Quick test using the Go Playground:
This only happens when using the S3 input in polling mode; the S3 input in SQS mode can successfully process the same object. As a workaround, I recommend using the S3 input in SQS mode.
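A minimal sketch of such a test, assuming the file name reported in this issue and that the polling path passes the listed key through `url.QueryUnescape()`. The helper name `pollerKey` is hypothetical and is not taken from the Beats code base:

```go
package main

import (
	"fmt"
	"net/url"
)

// pollerKey mimics what the polling code path is described as doing:
// it passes a key taken from a list objects response through
// url.QueryUnescape. The function name is hypothetical.
func pollerKey(listedKey string) (string, error) {
	return url.QueryUnescape(listedKey)
}

func main() {
	// File name reported in this issue; the '+' is a literal character.
	original := "2024-02-08T08:35:00+00:00.json"

	got, err := pollerKey(original)
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}

	fmt.Printf("original:  %q\n", original)
	fmt.Printf("unescaped: %q\n", got)
	// QueryUnescape treats '+' as an encoded space, so the key changes
	// and no longer names the object that exists in the bucket.
	fmt.Println("unchanged:", got == original)
}
```

The unescaped key contains a space where the original has `+`, which would explain why a later lookup with that key cannot find the object.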
It seems the list objects API doesn't escape the keys as the S3 notification did, so we don't need to unescape them. For example, here's how the list objects API returns the following two objects in S3.
S3 objects:
S3 list objects response:
The
@kaiyan-sheng, @aspacca, please let me know what you think. If the
@zmoog Thanks for looking into it! Yes, I think we can remove the
Great. I'll draft a PR and run a couple of tests then. @aspacca, let me know what you think 🙇
@zmoog, yes, you're spot on! The bug was already reported in #33998, and I came to the same conclusion. As mentioned in my comment on that issue:
So, as @kaiyan-sheng also mentioned, we should remove
@zmoog, sorry, please note that #18370 refers to the old version of the input, which has since been refactored. As we know, this is the place where
We should not remove this call, but rather this one: https://github.com/elastic/beats/blob/main/x-pack/filebeat/input/awss3/s3.go#L212
We introduced [^1] the `url.QueryUnescape()` function to unescape object keys from S3 notification in SQS messages. However, the object keys in the S3 list object responses do not require [^2] unescape. We must remove the unescape to avoid unintended changes to the S3 object key. [^1]: elastic#18370 [^2]: elastic#38012 (comment)
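The distinction drawn in this commit message can be sketched as follows. This is an illustration only: the helper names `keyFromNotification` and `keyFromListing` are hypothetical, and the percent-encoded form of the key is an assumed example of how a notification might carry it, not captured output.

```go
package main

import (
	"fmt"
	"net/url"
)

// keyFromNotification decodes an object key taken from an S3 event
// notification, where the key arrives URL-encoded.
// Hypothetical helper, not the Beats implementation.
func keyFromNotification(encoded string) (string, error) {
	return url.QueryUnescape(encoded)
}

// keyFromListing returns a key from a list objects response unchanged,
// because that response already carries the literal key.
// Hypothetical helper, not the Beats implementation.
func keyFromListing(listed string) string {
	return listed
}

func main() {
	// Assumed notification form of the key: ':' as %3A and '+' as %2B.
	fromSQS, err := keyFromNotification("2024-02-08T08%3A35%3A00%2B00%3A00.json")
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}

	// Literal key, as a list objects response would return it.
	fromList := keyFromListing("2024-02-08T08:35:00+00:00.json")

	fmt.Printf("from notification: %q\n", fromSQS)
	fmt.Printf("from listing:      %q\n", fromList)
	// Both paths should end up with the same literal key.
	fmt.Println("same key:", fromSQS == fromList)
}
```

Under these assumptions both paths resolve to the same literal key, which is the behavior the change aims for: decode only where the source is known to encode.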
…de (#38125) * Remove url.QueryUnescape() We introduced [^1] the `url.QueryUnescape()` function to unescape object keys from S3 notification in SQS messages. However, the object keys in the S3 list object responses do not require [^2] unescape. We must remove the unescape to avoid unintended changes to the S3 object key. [^1]: #18370 [^2]: #38012 (comment) --------- Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co>
…de (#38125) * Remove url.QueryUnescape() We introduced [^1] the `url.QueryUnescape()` function to unescape object keys from S3 notification in SQS messages. However, the object keys in the S3 list object responses do not require [^2] unescape. We must remove the unescape to avoid unintended changes to the S3 object key. [^1]: #18370 [^2]: #38012 (comment) --------- Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co> (cherry picked from commit 5f1e656)
…s-s3 input in polling mode (#38165) * [AWS] [S3] Remove url.QueryUnescape() from aws-s3 input in polling mode (#38125) We introduced [^1] the `url.QueryUnescape()` function to unescape object keys from S3 notification in SQS messages. However, the object keys in the S3 list object responses do not require [^2] unescape. We must remove the unescape to avoid unintended changes to the S3 object key. [^1]: #18370 [^2]: #38012 (comment) --------- Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co>
At some point, the fix in #18370 was introduced to unescape characters in S3 file names.
Since then, the implementation seems to have changed, and now there is a similar error:
Filename:
2024-02-08T08:35:00+00:00.json
This error seems to come from the GetObject call.