Review all Encoding usage for BOM compatibility #1027

Open
1 task done
paulirwin opened this issue Nov 17, 2024 · 0 comments
Labels
is:task A chore to be done

Comments

@paulirwin
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Task description

Java's StandardCharsets.UTF_8 does not write a byte order mark (BOM), while .NET's System.Text.Encoding.UTF8 emits one by default when used with writers such as StreamWriter. We have ensured that IOUtils.CHARSET_UTF_8 does not include a BOM, to match Java, and as part of #1018 we added an internal Support class that allows using StandardCharsets.UTF_8. We still need to review all usage of System.Text.Encoding.UTF8 to determine whether each one should be replaced with StandardCharsets.UTF_8 or IOUtils.CHARSET_UTF_8 (whichever best matches the corresponding Java Lucene code) to avoid BOM issues.
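
For reviewers, here is a minimal sketch of the difference in question, using only BCL types. `BomDemo` and `WriteWith` are hypothetical names made up for this illustration. The sketch does not show how IOUtils.CHARSET_UTF_8 or the internal StandardCharsets class are actually implemented; it only assumes that a BOM-free encoding behaves like `new UTF8Encoding(encoderShouldEmitUTF8Identifier: false)`.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class, not part of the project.
public static class BomDemo
{
    // Writes `text` through a StreamWriter using `encoding` and returns the raw bytes.
    public static byte[] WriteWith(Encoding encoding, string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            // MemoryStream.ToArray() still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // Encoding.UTF8 reports a three-byte preamble (EF BB BF), which StreamWriter emits.
        byte[] withBom = WriteWith(Encoding.UTF8, "a");

        // An encoding built with encoderShouldEmitUTF8Identifier: false has an empty preamble.
        Encoding utf8NoBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        byte[] withoutBom = WriteWith(utf8NoBom, "a");

        Console.WriteLine(BitConverter.ToString(withBom));    // EF-BB-BF-61
        Console.WriteLine(BitConverter.ToString(withoutBom)); // 61
    }
}
```

Note that `Encoding.UTF8.GetBytes(...)` by itself does not prepend a BOM. The BOM shows up where the preamble is consulted, for example `StreamWriter` or `File.WriteAllText` with an explicit `Encoding.UTF8` argument, so those call sites are probably the ones to look at first during the review.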

@paulirwin added the "is:task" (A chore to be done) label Nov 17, 2024
@paulirwin added this to the 4.8.0-beta00018 milestone Nov 17, 2024