Fix for issue 5850: Journal abbreviations in UTF-8 not recognized #7639
Merged
30 commits
2caa8e0 fix issue #5850 for encoding problem (MrGhabi)
26d5100 add a blank line for build.gradle (MrGhabi)
aec8447 initial as main branch for build.gradle (MrGhabi)
304adc0 initial as main branch for build.gradle (MrGhabi)
8a8df28 add the change of fix information of issue 5850 (MrGhabi)
590940a Fix check style (MrGhabi)
ee1cac7 Update CHANGELOG.md (MrGhabi)
c6f0cc2 Add the utf8 check for biblatex and ascii check for bibtex (MrGhabi)
cc099d7 Merge remote-tracking branch 'origin/fix-for-issue-5850' into fix-for… (MrGhabi)
a18a3af add the new localization string the l10 files (MrGhabi)
fe69305 fix error (MrGhabi)
673cc42 add the statement only in en.properties (MrGhabi)
7e04a98 Merge remote-tracking branch 'origin/fix-for-issue-5850' into fix-for… (MrGhabi)
f3bf4ac revert changes (MrGhabi)
083e3ea Update JabRef_da.properties (MrGhabi)
b1b5999 Update JabRef_ru.properties (MrGhabi)
9e94837 Update build.gradle (MrGhabi)
e07e530 Update JabRef_fa.properties (MrGhabi)
b1a4f58 Update JabRef_no.properties (MrGhabi)
85d2198 Update JabRef_pl.properties (MrGhabi)
7e44819 Update JabRef_pt.properties (MrGhabi)
a81d2ec Update JabRef_vi.properties (MrGhabi)
d980120 Update JabRef_zh_TW.properties (MrGhabi)
d7b1917 reset the default charset (MrGhabi)
cec382e Merge remote-tracking branch 'origin/fix-for-issue-5850' into fix-for… (MrGhabi)
02cc61e reset the default charset (MrGhabi)
a4aff23 add the javaDoc of UTF8Checker (MrGhabi)
e8e02a9 add the javaDoc of UTF8CheckerTest and IntegrityCheckTest (MrGhabi)
5092817 Remove the unwieldy Junit tests (MrGhabi)
7bfe74a Merge branch 'main' into fix-for-issue-5850 (Siedlerchr)
@@ -0,0 +1,44 @@
package org.jabref.logic.integrity;

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.jabref.logic.l10n.Localization;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.Field;

public class UTF8Checker implements EntryChecker {

    /**
     * Detect any non UTF-8 encoded field
     */
    @Override
    public List<IntegrityMessage> check(BibEntry entry) {
        List<IntegrityMessage> results = new ArrayList<>();
        Charset charset = Charset.forName(System.getProperty("file.encoding"));
        for (Map.Entry<Field, String> field : entry.getFieldMap().entrySet()) {
            boolean utfOnly = UTF8EncodingChecker(field.getValue().getBytes(charset));
            if (!utfOnly) {
                results.add(new IntegrityMessage(Localization.lang("Non-UTF-8 encoded field found"), entry,
                        field.getKey()));
            }
        }
        return results;
    }

    public static boolean UTF8EncodingChecker(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            decoder.decode(ByteBuffer.wrap(data));
        } catch (CharacterCodingException ex) {
            return false;
        }
        return true;
    }
}
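The decoding check in the file above can be tried in isolation. The following is a minimal sketch, not part of the PR: the class name `Utf8ValidationSketch` and the method name `isValidUtf8` are hypothetical, and ISO-8859-1 stands in for "some non-UTF-8 encoding" because it is always available through `StandardCharsets`. It shows that a strict UTF-8 decoder accepts the UTF-8 bytes of a string with umlauts and rejects the ISO-8859-1 bytes of the same string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone class; it only mirrors the decoding idea of the PR.
public class Utf8ValidationSketch {

    // Returns true when the bytes form a well-formed UTF-8 sequence.
    public static boolean isValidUtf8(byte[] data) {
        // REPORT is already the default for a new decoder; it is set
        // explicitly here to make the strict behaviour visible.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written with escapes so the
        // source file encoding does not matter.
        String text = "Gr\u00fc\u00dfe";
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));      // true
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1))); // false
    }
}
```

The outcome depends entirely on which charset produced the bytes, which is why the checker in the PR reads `file.encoding` before calling `getBytes`.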
38 changes: 38 additions & 0 deletions
src/test/java/org/jabref/logic/integrity/UTF8CheckerTest.java
@@ -0,0 +1,38 @@
package org.jabref.logic.integrity;

import java.util.Collections;
import java.util.List;

import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.StandardField;

import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class UTF8CheckerTest {

    private final BibEntry entry = new BibEntry();

    @Test
    void fieldAcceptsUTF8() {
        UTF8Checker checker = new UTF8Checker();
        entry.setField(StandardField.TITLE, "Only ascii characters!'@12");
        assertEquals(Collections.emptyList(), checker.check(entry));
    }

    @Test
    void fieldDoesNotAcceptUmlauts() {
        System.getProperties().put("file.encoding", "GBK");
        UTF8Checker checker = new UTF8Checker();
        String NonUTF8 = "";
        try {
            NonUTF8 = new String("你好,这条语句使用GBK字符集".getBytes(), "GBK");
        } catch (Exception e) {
            e.printStackTrace();
        }
        entry.setField(StandardField.MONTH, NonUTF8);
        assertEquals(List.of(new IntegrityMessage("Non-UTF-8 encoded field found", entry, StandardField.MONTH)), checker.check(entry));
    }

}
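The second test above sets the `file.encoding` system property and, as written, does not put the old value back. One possible way to keep such a change local is sketched below; this is an illustration under assumptions, not code from the PR. The class `SystemPropertyRestoreSketch`, the helper `withSystemProperty`, and the key `sketch.demo.encoding` are all hypothetical names chosen for the example.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of JabRef.
public class SystemPropertyRestoreSketch {

    // Runs body with key set to value, then restores the previous state,
    // even when body throws.
    public static <T> T withSystemProperty(String key, String value, Supplier<T> body) {
        String previous = System.getProperty(key);
        System.setProperty(key, value);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                System.clearProperty(key);
            } else {
                System.setProperty(key, previous);
            }
        }
    }

    public static void main(String[] args) {
        // A made-up key, so the demo does not touch file.encoding itself.
        String key = "sketch.demo.encoding";
        String seen = withSystemProperty(key, "GBK", () -> System.getProperty(key));
        System.out.println(seen);                    // value visible inside the body
        System.out.println(System.getProperty(key)); // restored state afterwards
    }
}
```

The `finally` block is the design point: the property is reset whether the body returns normally or fails, so later tests see the original value.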
You can simply remove that catch here and add `throws Exception` to the test method.
OK!
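The reviewer's suggestion can be illustrated outside of JUnit. The sketch below is an assumption-laden example, not the final code of the PR: the class `ThrowsInsteadOfCatchSketch` and both method names are hypothetical. It contrasts a method that swallows the checked exception from `new String(byte[], String)` with one that declares it, so a failure surfaces at the call site instead of silently producing an empty string.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical class contrasting the two styles discussed in the review.
public class ThrowsInsteadOfCatchSketch {

    // Catch-all style: on failure the caller silently gets "".
    public static String decodeWithCatch(byte[] bytes, String charsetName) {
        String result = "";
        try {
            result = new String(bytes, charsetName);
        } catch (Exception e) {
            // swallowed: the caller never learns that decoding failed
        }
        return result;
    }

    // Suggested style: declare the exception and let it propagate.
    public static String decodePropagating(byte[] bytes, String charsetName)
            throws UnsupportedEncodingException {
        return new String(bytes, charsetName);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodePropagating(bytes, "UTF-8"));
        System.out.println(decodeWithCatch(bytes, "no-such-charset").isEmpty());
        try {
            decodePropagating(bytes, "no-such-charset");
        } catch (UnsupportedEncodingException e) {
            System.out.println("unsupported charset reported to the caller");
        }
    }
}
```

In a test method, the propagating style means an unexpected exception fails the test directly rather than leaving a default value to be asserted on.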