Use charset from Content-Type header #22769
Comments
Discussed in Fix it Friday, the plan forward is to:
Ultimately this plan allows us to provide notice to users, hopefully learn which encodings other than Unicode are being used, and give users a way to continue working while we decide if there is anything we can do about the other encodings.
Not only does it ignore the charset, but if we e.g. use a header of
It took me a while to realize the warning was caused by the additional
@Pyppe can you open an issue for the deprecation warning with the
We might want to combine this with #72969. We already have a way to declare allowed parameters for a given media type, but nothing is validated. See elasticsearch/libs/x-content/src/main/java/org/elasticsearch/xcontent/XContentType.java, Line 154 in 20c9f75.
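To illustrate what validating declared parameters could look like, here is a minimal sketch. The class name, the `JSON_PARAMETERS` map, and the `validate` method are all hypothetical and are not taken from `XContentType`; the single `charset` pattern accepting only `utf-8` is an assumption chosen for the example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not the actual Elasticsearch code.
final class MediaTypeParameterCheck {

    // Assumed declaration: application/json allows only charset=utf-8.
    static final Map<String, Pattern> JSON_PARAMETERS = Map.of(
        "charset", Pattern.compile("utf-8", Pattern.CASE_INSENSITIVE)
    );

    // Rejects any parameter that is undeclared or whose value does not
    // match the declared pattern for that parameter.
    static void validate(Map<String, String> actual, Map<String, Pattern> allowed) {
        for (Map.Entry<String, String> entry : actual.entrySet()) {
            String name = entry.getKey().toLowerCase(Locale.ROOT);
            Pattern pattern = allowed.get(name);
            if (pattern == null) {
                throw new IllegalArgumentException("unknown media type parameter [" + name + "]");
            }
            if (!pattern.matcher(entry.getValue()).matches()) {
                throw new IllegalArgumentException(
                    "unsupported value [" + entry.getValue() + "] for parameter [" + name + "]");
            }
        }
    }
}
```

With a check like this, a request declaring `charset=ISO-8859-1` would fail early with a clear message instead of being parsed as if it were Unicode.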
Pinging @elastic/es-core-infra (Team:Core/Infra)
#64406 (which went into v8.0.0) seems to have added the ability to parse a charset, but accepts only utf-8.
In #22691 (comment), I added a comment which points out that our code currently ignores the charset parameter of the Content-Type header and that this is something we should look into. Looking at the javadocs of JsonFactory to see how different charsets are handled:
Unfortunately, not all clients adhere to the Unicode-only encodings, as I have seen some send data as ISO-8859-1. I think we should consider parsing the charset from the Content-Type header when available and handling it appropriately (failing if we cannot support it, converting, creating the parser differently, etc.).
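As a rough sketch of the "parse, then fail or convert" idea, the following uses only the JDK. The class and method names are hypothetical, the header parsing is deliberately simplified, and defaulting to UTF-8 when no charset is declared is an assumption, not existing Elasticsearch behavior.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;

// Hypothetical helper, not the actual Elasticsearch code.
final class ContentTypeCharset {

    // Returns the charset declared in the header value, or UTF-8 (assumed
    // default) when the header is absent or has no charset parameter.
    static Charset charsetOf(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the media type
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding quotes, e.g. charset="utf-8".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                // Fail explicitly rather than silently misreading the body.
                throw new IllegalArgumentException("unsupported charset in Content-Type: " + value, e);
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Re-encodes the request body as UTF-8 so a Unicode-only parser can read it.
    static byte[] toUtf8(byte[] body, String contentType) {
        Charset declared = charsetOf(contentType);
        if (declared.equals(StandardCharsets.UTF_8)) {
            return body;
        }
        return new String(body, declared).getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, `ContentTypeCharset.toUtf8(body, "application/json; charset=ISO-8859-1")` would decode the bytes as ISO-8859-1 and hand UTF-8 bytes to the parser. Converting up front costs an extra copy of the body; the alternative mentioned above, creating the parser from a `Reader` with the declared charset, avoids that copy.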