Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix utf8 problems with specialchars #2485

Merged

Conversation

nPraml
Copy link
Contributor

@nPraml nPraml commented Jan 11, 2022

Hello @rbygrave ,

in our application we use some special UTF8 characters, e.g. ö,ü,ä, etc...
We had a problem with generating idx-checksums for migration scripts. If we generated the checksum in Eclipse, it returned an other checksum as maven in the cmd build.

I found this problem for encodings in Stackoverflow: https://stackoverflow.com/questions/12272698/why-does-maven-give-me-different-utf-8-characters-than-eclipse-test-run-in-ecli

I fixed this problem in MChecksum.java.

@rPraml checked how files/migrations will be read and written in other classes. He fixed the encodings to UTF8.

Can you check this PR and give us feedback?

Kind regards
Noemi

Copy link
Member

@rbygrave rbygrave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff. in MChecksumTest are we able to keep that original test and have the special chars one as a new additional test?

try (InputStream inputStream = UrlHelper.openNoCache(resource)) {
return readContent(new InputStreamReader(inputStream));
try (InputStream inputStream = UrlHelper.openNoCache(resource);
Reader reader = IOUtils.newReader(inputStream)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to review indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rbygrave Thanks for the comments! I have updated the PR

@@ -348,7 +347,9 @@ protected String readFile(String fileName) throws IOException {
if (!f.exists()) {
return null;
}
return readContent(new FileReader(f));
try (Reader reader = IOUtils.newReader(f)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixes also a possible resource leak

@@ -13,9 +15,8 @@
* Returns the checksum of the file. Agnostic of encoding and new line character.
*/
static int calculate(File file) {
try {
try (BufferedReader bufferedReader = IOUtils.newReader(file)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important @rbygrave This will introduce a breaking change, because we are now using UTF-8
If checksum was computed on a JVM where Charset.defaultCharset() is not UTF-8 (e.g. on a german windows) you may get a different checksum when the file contains special chars like "umlauts".

The risk is small, but it exists/should be mentioned in the changelog

Note: With JDK 18 https://openjdk.java.net/jeps/400, the charset will be set to UTF-8 by default, so you may get this problem anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine in that although we generate the index with checksums it isn't being used/read yet. The migration runner isn't actually using this checksum index file yet.

In that way we can think of this as non-breaking.

@rbygrave rbygrave added this to the 12.14.1 milestone Jan 12, 2022
@rbygrave rbygrave added the bug label Jan 12, 2022
@rbygrave
Copy link
Member

Great work @NSzemenyei !!

@rbygrave rbygrave merged commit c577913 into ebean-orm:master Jan 12, 2022
@rPraml rPraml deleted the fix-utf8-problems-with-specialchars branch February 25, 2022 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants