
RFC: Use a lossy conversion in ResponseReader::text_utf8 #88

Merged 1 commit on Feb 6, 2021


adamreichold
Contributor

@adamreichold adamreichold commented Feb 6, 2021

Fixes #87

@Shnatsel

Shnatsel commented Feb 6, 2021

I admit I have no idea what the correct behavior is here. It's possible that refusing to process invalid UTF-8 is the right thing to do, and that the lossy conversion used by e.g. ureq is wrong. Is there any standard specifying this?

@sbstp
Owner

sbstp commented Feb 6, 2021

It seems that reqwest also uses a lossy conversion: https://docs.rs/reqwest/0.11.0/reqwest/struct.Response.html#method.text

@adamreichold
Contributor Author

We also decided on lossy conversion when using encoding_rs, as encoding_rs_io defaults to this behaviour. So purely for reasons of consistency, the lossy version seems preferable.
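
For readers following along, here is a minimal sketch of the difference between the strict and lossy conversions discussed above, using only the standard library. The helper names `text_strict` and `text_lossy` are hypothetical and chosen for illustration; this is not the code from the merged commit.

```rust
use std::string::FromUtf8Error;

// Hypothetical helper: strict conversion that rejects any invalid UTF-8.
fn text_strict(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

// Hypothetical helper: lossy conversion that replaces invalid sequences
// with U+FFFD (the Unicode replacement character) instead of failing.
fn text_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // "caf" followed by 0xE9, which is "é" in Latin-1 but is not valid UTF-8 on its own.
    let body = b"caf\xE9".to_vec();

    // The strict version returns an error for the whole body.
    assert!(text_strict(body.clone()).is_err());

    // The lossy version keeps the valid prefix and substitutes the bad byte.
    assert_eq!(text_lossy(&body), "caf\u{FFFD}");
    println!("{}", text_lossy(&body));
}
```

With the strict variant, a single stray byte makes the whole response body unreadable as text, which appears to be the kind of failure reported in #87; the lossy variant trades exactness for always returning a string.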

@sbstp sbstp merged commit aa7a74d into sbstp:master Feb 6, 2021
@adamreichold adamreichold deleted the lossy-utf8-conversion branch February 6, 2021 20:29
Development

Successfully merging this pull request may close these issues.

Fetching google.com without 'charsets' feature fails; works in other clients