Large strings decoded from latin1 and then encoded to utf8 has wrong size #22728
Labels
buffer
Issues and PRs related to the buffer subsystem.
confirmed-bug
Issues with confirmed bugs.
encoding
Issues and PRs related to the TextEncoder and TextDecoder APIs.
Comments
Quick test on 8.11.4 confirms the difference in behavior between Node.js 8 and 10. /cc @nodejs/intl @nodejs/buffer @srl295
vsemozhetbyt added the encoding label Sep 6, 2018
This was fixed by #18216, I think.
Yep, makes sense. That PR is still pending a backport to 8.x. Once the backport happens, this issue should be resolved.
→ Backport is in #22731
addaleax pushed a commit to addaleax/node that referenced this issue Oct 2, 2018
* Respect `encoding` argument when the string is externalized.
* Copy the string when the write request can outlive the externalized string.

This commit removes `StringBytes::GetExternalParts()` because it is fundamentally broken.

Fixes: nodejs#18146
Fixes: nodejs#22728
PR-URL: nodejs#18216
BethGriggs pushed a commit that referenced this issue Oct 16, 2018
* Respect `encoding` argument when the string is externalized.
* Copy the string when the write request can outlive the externalized string.

This commit removes `StringBytes::GetExternalParts()` because it is fundamentally broken.

Fixes: #18146
Fixes: #22728
Backport-PR-URL: #22731
PR-URL: #18216
Reviewed-By: James M Snell <jasnell@gmail.com>
Decoding a latin1 buffer larger than about 1MB to a string and then encoding that string as UTF-8 gives a buffer with the same number of bytes as the latin1 input, even though non-ASCII characters need more bytes in UTF-8.
This seems to work properly on v10.x but not v8.x or v9.x.
Code that demonstrates the problem:
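The original snippet did not survive in this copy of the issue. What follows is only a minimal sketch of the kind of reproduction described above, not the reporter's code: the 2MB size, the fill byte `0xe9` ('é' in latin1), and all variable names are assumptions chosen for illustration.

```javascript
// Hypothetical reproduction sketch (not the reporter's original code).
// Assumption: 2MB is comfortably above the ~1MB threshold mentioned in the report.
const size = 2 * 1024 * 1024;

// Fill the buffer with 0xE9, which is 'é' in latin1 and needs two bytes in UTF-8.
const latin1 = Buffer.alloc(size, 0xe9);

// Decode as latin1, then re-encode the resulting string as UTF-8.
const str = latin1.toString('latin1');
const utf8 = Buffer.from(str, 'utf8');

// A correct encoder should produce twice as many bytes as the input here.
const expectedBytes = 2 * size;
console.log({
  inputBytes: latin1.length,
  utf8Bytes: utf8.length,
  expectedBytes,
  correct: utf8.length === expectedBytes,
});
```

Per the report, an affected v8.x or v9.x build would be expected to print a `utf8Bytes` equal to `inputBytes`, while v10.x should report `correct: true`.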