Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content-Type: application/json; charset=utf-8 can cause issues and may be incorrect #528

Open
jbrechtel opened this issue Feb 29, 2024 · 2 comments

Comments

@jbrechtel
Copy link

setRequestBodyJSON appends a "; charset=utf-8" to the end of the Content-Type header and this causes some libraries like

https://hc.apache.org/httpcomponents-core-4.4.x/current/httpcore/apidocs/org/apache/http/entity/ContentType.html

to fail to parse such HTTP requests.

It seems like the charset parameter is at best redundant (since JSON must be UTF-8) and at worst incorrect entirely according to the RFC - see the conversation here https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

I suggest removing it entirely from setRequestBodyJSON and only set Content-Type to application/json. If that's OK then I'm happy to submit a PR.

@snoyberg
Copy link
Owner

snoyberg commented Mar 4, 2024

since JSON must be UTF-8

That's not historically true, per that SO discussion.

I haven't seen anything indicating that ; charset=utf-8 is in violation of any spec here. I'd be worried about breaking someone else's workflow, though that seems unlikely. Nonetheless, changing the default behavior because one implementation is having a problem isn't something I'd be happy about without seeing some clear RFCs specifying that ; charset=utf-8 isn't allowed here.

@jbrechtel
Copy link
Author

I read "Unicode" as "UTF-8" in this exchange and thought it was more clear-cut even historically -- oops.

Re: Breaking others, that's fair. I'd still suggest that an, at best, redundant parameter that is problematic for a specific (and widely used HTTP implementation) is best left off.

At any rate - we can (and have) obviously work around it by just replacing the Content-Type header with one lacking the problematic charset parameter.

Hopefully at least anyone else running into this problem can find this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants