Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Abbreviation for Escaped Ampersand #9288

Merged
merged 26 commits into from
Nov 2, 2022
Merged

Conversation

ANUu7312578
Copy link
Contributor

Fixes #8948.

This pull request attempts to fix issue #8948. The main #8948 issue has 3 sub-components:

  1. Abbreviations should always be without \& --> There is a CI check "necessary" at the abbrv repo
  2. JabRef should be able to handle both & and \& when abbrevating and unabbreviating
  3. When putting a result, \& should be used

The first sub-component, is currently being implemented by the pull request JabRef/abbrv.jabref.org#108. The changes highlighted in this pull request attempt to fix the second sub-component. It seems that the third component has already been implemented except that it has to be manually enabled by users.

A test which attempt to verify that unescaped ampersands in journal titles actually result in abbreviations has also been added.

Any feedback would be appreciated.

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, otherwise: Looks good to me.

The latexfreefield could not be used for checking the journal name? (org.jabref.model.entry.BibEntry#getResolvedFieldOrAliasLatexFree)

@@ -48,7 +49,8 @@ private static boolean isMatchedAbbreviated(String name, Abbreviation abbreviati
* Letters) or its abbreviated form (e.g. Phys. Rev. Lett.).
*/
public boolean isKnownName(String journalName) {
String journal = journalName.trim();
// Trims the String and also replaces any instances of "\&" with "&" for abbreviation search purposes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trims the String and also replaces any instances of "\&" with "&" is a literal explanation of the code and can be removed.

Suggested change
// Trims the String and also replaces any instances of "\&" with "&" for abbreviation search purposes
// The journal lists contains unescaped only, thus "\&" is replaced by "&" for abbreviation search purposes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

Comment on lines 80 to 81
// Trims the String and also replaces any instances of "\&" with "&" for abbreviation search purposes
String journal = input.trim().replaceAll(Matcher.quoteReplacement("\\&"), "&");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above 😇

@ANUu7312578
Copy link
Contributor Author

@koppor I must admit that I did not understand exactly what org.jabref.model.entry.BibEntry#getResolvedFieldOrAliasLatexFree was doing or what its parameters meant. So, I just used replaceAll because it seemed easier 😅. I have pushed a commit which attempts to use org.jabref.model.entry.BibEntry#getResolvedFieldOrAliasLatexFree. Could you provide some feedback? From my testing, it seems to accomplish the same job that replaceAll does. Thanks!

@Siedlerchr
Copy link
Member

@ANUu7312578 JabRef supports both BibTeX and biblatex. The old BibTeX format does not understand unicode and also special characters like & needs to be escaped using the special latex syntax.
The method GetResolvedFieldOrAlias does a few more things but the relevant part here is that it returns a latex command free value of the field (so basically the unicode)

@koppor
Copy link
Member

koppor commented Oct 26, 2022

I miss test cases for following BibTeX entry:

@article{test,
  journal = {ACS Applied Materials \\& Interfaces},
}

The result should be

@article{test,
  journal = {ACS Appl. Mater. Interfaces},
}

Second test:

@article{test,
  journal = {ACS Applied Materials & Interfaces},
}

The result should be

@article{test,
  journal = {ACS Appl. Mater. Interfaces},
}

Maybe, there are these tests, but better one more than no one.

@ANUu7312578
Copy link
Contributor Author

@koppor Hi! Sorry, but I'm not sure what you mean when you say that you miss test cases for the BibTeX entries. Do you mean that I should create tests which actually use a BibEntry object which is effectively

@article{test,
  journal = {ACS Applied Materials \\& Interfaces},
}

and try to abbreviate the BibEntry object journal title in order to test if it abbreviates into

@article{test,
  journal = {ACS Appl. Mater. Interfaces},
}

If so, I am not actually sure which function(s) to call to abbreviate the journal name in a BibEntry object. Would you be able to provide some pointers on this? It seems that org.jabref.gui.journals.UndoableAbbreviator is the only class which calls org.jabref.logic.journals.JournalAbbreviationRepository. However, this class seems to be more for the GUI and not for internal logic. If not, could you explain a bit more what you meant? Any clarification would be welcome!

Additionally, I am not sure I see the difference between the "first test" and the "second test". It seems that they both have the journal name as "ACS Applied Materials \\& Interfaces" and the abbreviated name as "ACS Appl. Mater. Interfaces". Also, I was wondering why the abbreviation of "ACS Applied Materials \\& Interfaces" would be "ACS Appl. Mater. Interfaces". Am I mistaken in thinking that the journal name should only abbreviate if the ampersand in the journal name is unescaped or only escaped once (e.g., "ACS Applied Materials \& Interfaces" -> "ACS Appl. Mater. Interfaces")?

Sorry for all the questions!

@Siedlerchr
Copy link
Member

Hi,
yes it's a bit confusing. But I think of a kind of integration test here:
You would use the logic classes:
UndoableUnabbreviator and UndoableAbbreviator

Create a new BibEntry and a new databasecontext

@koppor
Copy link
Member

koppor commented Oct 27, 2022

@koppor Hi! Sorry, but I'm not sure what you mean when you say that you miss test cases for the BibTeX entries. Do you mean that I should create tests which actually use a BibEntry object

Yes.

Additionally, I am not sure I see the difference between the "first test" and the "second test".

I updated the second test to have not any escapings.

Am I mistaken in thinking that the journal name should only abbreviate if the ampersand in the journal name is unescaped or only escaped once (e.g., "ACS Applied Materials & Interfaces" -> "ACS Appl. Mater. Interfaces")?

The thing is, it should be able to abbreviate in both cases. This should be shown by a test.

@ANUu7312578
Copy link
Contributor Author

Hi! I've attempted to write some tests according to the test cases specified by @koppor and using UndoableUnabbreviator and UndoableAbbreviator as suggested by @Siedlerchr. Currently, these test cases reside in org.jabref.logic.bibtex.BibEntryWriterTest. However, this causes the restrictUsagesInLogic test to fail because a CompoundEdit object is required to abbreviate a journal title. Could I get some feedback as to whether the tests I wrote are what was actually wanted and if I should move these tests to another test class in order to stop restrictUsagesInLogic from failing? Or, if there is another way to resolve restrictUsagesInLogic, any advice would be appreciated. Thanks!

@koppor
Copy link
Member

koppor commented Oct 28, 2022

The tests test the Journal Abbreviations. Thus, they should be in the test class for the Journal Abbreviator.

An Exception to the ArchUnit rules can be added in this case. There needs to be some refinement so that one can abbreviate without using a gui class.

I was thinking, there is a method
In: BibEntry
out: BibEntry abbreviated

Seems that this resides in gui only?

@Siedlerchr
Copy link
Member

@koppor The logic methods are in UndoableAbbreviator and UndoableUnabbreviator. They take an entry, a field and a databasecontext However, they require the swing compound edit for undoing

@ANUu7312578
Copy link
Contributor Author

I have moved the tests from the org.jabref.logic.bibtex.BibEntryWriterTest class to the org.jabref.logic.journals.JournalAbbreviationRepositoryTest class like @koppor suggested. Hopefully it is now in the appropriate test class.

Is there anything else which needs fixing? I am also not really sure how to add an exception to the ArchUnit rules 😅.

@Siedlerchr
Copy link
Member

@ANUu7312578 For the test I would adapt this as follows:
Then you should be able to annotate the test class with @AllowedToUseSwing

  @ArchTest
   public static void restrictUsagesInLogic(JavaClasses classes) {
       noClasses().that().resideInAPackage(PACKAGE_ORG_JABREF_LOGIC)
                   .and().areNotAnnotatedWith(AllowedToUseSwing.class)
                  .should().dependOnClassesThat().resideInAPackage(PACKAGE_JAVAX_SWING)
                  .orShould().dependOnClassesThat().haveFullyQualifiedName(CLASS_ORG_JABREF_GLOBALS)
                  .check(classes);
   }

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the tests.

They are currently too complicated. I made another proposal

Comment on lines 64 to 79
/**
* Returns the LaTeX free version of a journal (e.g., IEEE Design \& Test would be returned as IEEE Design & Test)
* i.e., the journal name should not contain any unnecessary LaTeX syntax such as escaped ampersands
*
* @param journal The journal name
* @return The LaTeX free version of the journal name
*/
private String getLatexFreeJournal(String journal) {
BibEntry entry = new BibEntry();
entry.setField(StandardField.JOURNAL, journal);
if (entry.getResolvedFieldOrAliasLatexFree(StandardField.JOURNAL, null).isPresent()) {
journal = entry.getResolvedFieldOrAliasLatexFree(StandardField.JOURNAL, null).get();
}
return journal;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not the intention - sorry for that!

Your old code with the replacement was better!

I thought, you were reading the BibTeX field content at some place. Here, you seem to read the journal name from somewhere.

CHANGELOG.md Outdated
@@ -74,6 +74,7 @@ Note that this project **does not** adhere to [Semantic Versioning](http://semve
- We fixed an issue where hitting enter on the search field within the preferences dialog closed the dialog. [koppor#630](https://github.com/koppor/jabref/issues/630)
- We fixed a typo within a connection error message. [koppor#625](https://github.com/koppor/jabref/issues/625)
- We fixed an issue where the 'close dialog' key binding was not closing the Preferences dialog. [#8888](https://github.com/jabref/jabref/issues/8888)
- We fixed an issue where journal abbreviations would not abbreviate journal titles with escaped ampersands (\\&) which are written in accordance with Latex format. [#8948](https://github.com/JabRef/jabref/issues/8948)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the last part, that information should be made available in the internet elsewhere 😃

Suggested change
- We fixed an issue where journal abbreviations would not abbreviate journal titles with escaped ampersands (\\&) which are written in accordance with Latex format. [#8948](https://github.com/JabRef/jabref/issues/8948)
- We fixed an issue where journal abbreviations would not abbreviate journal titles with escaped ampersands (\\&). [#8948](https://github.com/JabRef/jabref/issues/8948)

Comment on lines 186 to 196
// @formatter:off
String expectedAbbreviatedJournalEntry = """
@Article{,
journal = {ACS Appl. Mater. Interfaces},
}
""".replaceAll("\n", OS.NEWLINE);
// @formatter:on

undoableAbbreviator.abbreviate(bibDatabase, entryWithEscapedAmpersandInJournal, StandardField.JOURNAL, new CompoundEdit());
bibEntryWriter.write(entryWithEscapedAmpersandInJournal, bibWriter, BibDatabaseMode.BIBLATEX);
assertEquals(expectedAbbreviatedJournalEntry, stringWriter.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparison on a String bases should only be used for serialization tests.

In your case, do

Suggested change
// @formatter:off
String expectedAbbreviatedJournalEntry = """
@Article{,
journal = {ACS Appl. Mater. Interfaces},
}
""".replaceAll("\n", OS.NEWLINE);
// @formatter:on
undoableAbbreviator.abbreviate(bibDatabase, entryWithEscapedAmpersandInJournal, StandardField.JOURNAL, new CompoundEdit());
bibEntryWriter.write(entryWithEscapedAmpersandInJournal, bibWriter, BibDatabaseMode.BIBLATEX);
assertEquals(expectedAbbreviatedJournalEntry, stringWriter.toString());
undoableAbbreviator.abbreviate(bibDatabase, entryWithEscapedAmpersandInJournal, StandardField.JOURNAL, new CompoundEdit());
BibEntry expectedAbbreviatedJournalEntry = new BibEntry(StandardEntryType.Article)
.withField(StandardField.JOURNAL, "ACS Appl. Mater. Interfaces");
assertEquals(expectedAbbreviatedJournalEntry , entryWithEscapedAmpersandInJournal);

…sInLogic test changed to allow classes annotated with @AllowedToUseString. Test cases and escaped ampersand replacement in abbreviation changed.
@ANUu7312578
Copy link
Contributor Author

Thanks to @Siedlerchr for the advice on how to add an exception to the ArchUnit rules! Hopefully, I've now accomplished everything @koppor asked for. The restrictUsagesInLogic test should no longer be failing. Please let me know if there is anything else that I should do! 😃

@Siedlerchr
Copy link
Member

Thanks again for the quick follow up! From my side it looks already good from my side

@Siedlerchr Siedlerchr marked this pull request as ready for review November 1, 2022 18:58
@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Nov 1, 2022
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for keeping working on this.

I added some comments 😇. - Since they are more general hints than really required, I go forward with a merge 😅. You can do a follow-up PR if you have time ^^

@@ -48,7 +49,7 @@ private static boolean isMatchedAbbreviated(String name, Abbreviation abbreviati
* Letters) or its abbreviated form (e.g. Phys. Rev. Lett.).
*/
public boolean isKnownName(String journalName) {
String journal = journalName.trim();
String journal = journalName.trim().replaceAll(Matcher.quoteReplacement("\\&"), "&");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, there is the possibility to pre-compile the Regex. Was it with the Pattern class?

Comment on lines +152 to +159
assertEquals(new Abbreviation("ACS Applied Materials & Interfaces", "ACS Appl. Mater. Interfaces"), repository.get("ACS Applied Materials & Interfaces").get());
assertEquals(new Abbreviation("ACS Applied Materials & Interfaces", "ACS Appl. Mater. Interfaces"), repository.get("ACS Applied Materials \\& Interfaces").get());
assertEquals(new Abbreviation("Antioxidants & Redox Signaling", "Antioxid. Redox Signaling"), repository.get("Antioxidants & Redox Signaling").get());
assertEquals(new Abbreviation("Antioxidants & Redox Signaling", "Antioxid. Redox Signaling"), repository.get("Antioxidants \\& Redox Signaling").get());

repository.addCustomAbbreviation(new Abbreviation("Long & Name", "L. N.", "LN"));
assertEquals(new Abbreviation("Long & Name", "L. N.", "LN"), repository.get("Long & Name").get());
assertEquals(new Abbreviation("Long & Name", "L. N.", "LN"), repository.get("Long \\& Name").get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, we expect one assertation per test case. Thus, this method should be split into several ones. @ParamterizedTest is the the thign to use.

Comment on lines +168 to +169
BibEntry entryWithEscapedAmpersandInJournal = new BibEntry(StandardEntryType.Article);
entryWithEscapedAmpersandInJournal.setField(StandardField.JOURNAL, "ACS Applied Materials \\& Interfaces");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BibEntry entryWithEscapedAmpersandInJournal = new BibEntry(StandardEntryType.Article);
entryWithEscapedAmpersandInJournal.setField(StandardField.JOURNAL, "ACS Applied Materials \\& Interfaces");
BibEntry entryWithEscapedAmpersandInJournal = new BibEntry(StandardEntryType.Article)
.withField(StandardField.JOURNAL, "ACS Applied Materials \\& Interfaces");

@koppor koppor merged commit 7367ee5 into JabRef:main Nov 2, 2022
Siedlerchr added a commit to DavidOWade/jabref that referenced this pull request Nov 3, 2022
* upstream/main: (30 commits)
  New translations JabRef_en.properties (German) (JabRef#9336)
  Fix Abbreviation for Escaped Ampersand (JabRef#9288)
  Fix font size preference not updating in preference dialog 8386 (JabRef#9287)
  New Crowdin updates (JabRef#9333)
  Refresh example styles
  Squashed 'buildres/csl/csl-styles/' changes from 4eee79a..13fd98e
  Bump unirest-java from 3.13.11 to 3.13.12 (JabRef#9330)
  Bump checkstyle from 10.3.4 to 10.4 (JabRef#9331)
  Bump gittools/actions from 0.9.14 to 0.9.15 (JabRef#9332)
  Group context menu presents relevant options depending on number of subgroups (JabRef#9286)
  Removed BibTeX file type and included HTML and Markdown types (JabRef#9318)
  Fix issue: Auto-linking files with safe character replacements JabRef#9267 (JabRef#9316)
  Fix for issue 8806: Button highlights doesn't respect rounded corners (JabRef#9320)
  New Crowdin updates (JabRef#9324)
  Update CHANGELOG.md
  Try to relocate listener binding (JabRef#9238)
  Changed the messages after importing unlinked local files to past passive tense. (JabRef#9308)
  Changed the color of found text from red to high contrast  (JabRef#9315)
  New Crowdin updates (JabRef#9317)
  Add skips for MacOS X build (JabRef#9305)
  ...
Siedlerchr added a commit to yanyanliu1400/jabref that referenced this pull request Nov 7, 2022
* upstream/main: (61 commits)
  Bump jackson-datatype-jsr310 from 2.13.4 to 2.14.0 (JabRef#9357)
  Bump jackson-dataformat-yaml from 2.13.4 to 2.14.0 (JabRef#9356)
  Bump tika-core from 2.5.0 to 2.6.0 (JabRef#9358)
  Refine guideline "Set up a local workspace" (JabRef#9355)
  Adds missing "requires" statement (JabRef#9354)
  Bind "Find unlinked files" default directory to the current library (JabRef#9290)
  Rename "Remote" to "Tele" (JabRef#9270)
  Fixed display of file field adjusting for the dark theme (JabRef#9343)
  Update sorting of entries in maintable by special fields immediately (JabRef#9338)
  New translations JabRef_en.properties (German) (JabRef#9336)
  Fix Abbreviation for Escaped Ampersand (JabRef#9288)
  Fix font size preference not updating in preference dialog 8386 (JabRef#9287)
  New Crowdin updates (JabRef#9333)
  Refresh example styles
  Squashed 'buildres/csl/csl-styles/' changes from 4eee79a..13fd98e
  Bump unirest-java from 3.13.11 to 3.13.12 (JabRef#9330)
  Bump checkstyle from 10.3.4 to 10.4 (JabRef#9331)
  Bump gittools/actions from 0.9.14 to 0.9.15 (JabRef#9332)
  Group context menu presents relevant options depending on number of subgroups (JabRef#9286)
  Removed BibTeX file type and included HTML and Markdown types (JabRef#9318)
  ...
Siedlerchr added a commit that referenced this pull request Nov 8, 2022
* upstream/main:
  Add date format and respective test case (#9310)
  Bump jackson-datatype-jsr310 from 2.13.4 to 2.14.0 (#9357)
  Bump jackson-dataformat-yaml from 2.13.4 to 2.14.0 (#9356)
  Bump tika-core from 2.5.0 to 2.6.0 (#9358)
  Refine guideline "Set up a local workspace" (#9355)
  Adds missing "requires" statement (#9354)
  Bind "Find unlinked files" default directory to the current library (#9290)
  Rename "Remote" to "Tele" (#9270)
  Fixed display of file field adjusting for the dark theme (#9343)
  Update sorting of entries in maintable by special fields immediately (#9338)
  New translations JabRef_en.properties (German) (#9336)
  Fix Abbreviation for Escaped Ampersand (#9288)
  Fix font size preference not updating in preference dialog 8386 (#9287)

# Conflicts:
#	CHANGELOG.md
Siedlerchr added a commit to shafinkamal/jabref that referenced this pull request Nov 11, 2022
* upstream/main: (55 commits)
  Update guidelines-for-setting-up-a-local-workspace.md
  Add link to good first issues (and fix link)
  Refine guidelines (and fix .md file linking)
  Extend ArXiv fetcher results by using data from related DOIs (JabRef#9170)
  New Crowdin updates (JabRef#9366)
  Fix token
  New translations JabRef_en.properties (Chinese Simplified) (JabRef#9364)
  Fix date field change causes an uncaught exception (JabRef#9365)
  Add date format and respective test case (JabRef#9310)
  Bump jackson-datatype-jsr310 from 2.13.4 to 2.14.0 (JabRef#9357)
  Bump jackson-dataformat-yaml from 2.13.4 to 2.14.0 (JabRef#9356)
  Bump tika-core from 2.5.0 to 2.6.0 (JabRef#9358)
  Refine guideline "Set up a local workspace" (JabRef#9355)
  Adds missing "requires" statement (JabRef#9354)
  Bind "Find unlinked files" default directory to the current library (JabRef#9290)
  Rename "Remote" to "Tele" (JabRef#9270)
  Fixed display of file field adjusting for the dark theme (JabRef#9343)
  Update sorting of entries in maintable by special fields immediately (JabRef#9338)
  New translations JabRef_en.properties (German) (JabRef#9336)
  Fix Abbreviation for Escaped Ampersand (JabRef#9288)
  ...

# Conflicts:
#	CHANGELOG.md
#	src/main/java/org/jabref/preferences/AppearancePreferences.java
#	src/main/java/org/jabref/preferences/JabRefPreferences.java
#	src/test/java/org/jabref/gui/theme/ThemeManagerTest.java
Siedlerchr added a commit that referenced this pull request Nov 12, 2022
* upstream/main: (40 commits)
  Update guidelines-for-setting-up-a-local-workspace.md
  Add link to good first issues (and fix link)
  Refine guidelines (and fix .md file linking)
  Extend ArXiv fetcher results by using data from related DOIs (#9170)
  New Crowdin updates (#9366)
  Fix token
  New translations JabRef_en.properties (Chinese Simplified) (#9364)
  Fix date field change causes an uncaught exception (#9365)
  Add date format and respective test case (#9310)
  Bump jackson-datatype-jsr310 from 2.13.4 to 2.14.0 (#9357)
  Bump jackson-dataformat-yaml from 2.13.4 to 2.14.0 (#9356)
  Bump tika-core from 2.5.0 to 2.6.0 (#9358)
  Refine guideline "Set up a local workspace" (#9355)
  Adds missing "requires" statement (#9354)
  Bind "Find unlinked files" default directory to the current library (#9290)
  Rename "Remote" to "Tele" (#9270)
  Fixed display of file field adjusting for the dark theme (#9343)
  Update sorting of entries in maintable by special fields immediately (#9338)
  New translations JabRef_en.properties (German) (#9336)
  Fix Abbreviation for Escaped Ampersand (#9288)
  ...

# Conflicts:
#	CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bib(la)tex status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Equality check for Journal abbreviations
5 participants