Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parser for German months (#3536) #3734

Merged
merged 15 commits into from
Mar 9, 2018
Merged

Add parser for German months (#3536) #3734

merged 15 commits into from
Mar 9, 2018

Conversation

Eiswindyeti
Copy link
Contributor

@Eiswindyeti Eiswindyeti commented Feb 17, 2018

Changed the month parse that it can also parse German months now.
This should solve issue #3536.
Have I missed something?

(For PE1718: Name: Johannes Preßmar)


  • Change in CHANGELOG.md described
  • Tests created for changes
  • Screenshots added (for bigger UI changes)
  • Manually tested changed features in running JabRef
  • Check documentation status (Issue created for outdated help page at help.jabref.org?)

CHANGELOG.md Outdated
@@ -34,6 +34,7 @@ The new default removes the linked file from the entry instead of deleting the f
- The group editing window can now also be called by double-clicking the group to be edited. [koppor#277](https://github.com/koppor/jabref/issues/277)
- The magnifier icon at the search shows the [search mode](https://help.jabref.org/en/Search#search-modes) again. [#3535](https://github.com/JabRef/jabref/issues/3535)
- We added a new cleanup operation that replaces ligatures with their expanded form. [#3613](https://github.com/JabRef/jabref/issues/3613)
- We added the function to parse German month names.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coressponding issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forgot. Fixed it.

* @author Johannes Preßmar
*/
private static Optional<Month> parseGermanShortMonth(String value) {
String[] shortMonths = new DateFormatSymbols(Locale.GERMAN).getShortMonths();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better use the new java 8 Date & Time Api

Construt a DateTimeFormatter.ofPattern("MMM", Locale.DE)
Use YearMonth.parse and then you can get a Numeric value from that
https://docs.oracle.com/javase/8/docs/api/java/time/YearMonth.html#parse-java.lang.CharSequence-

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it. Is it better like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! Looks better now

@@ -93,9 +95,31 @@
int number = Integer.parseInt(value);
return Month.getMonthByNumber(number);
} catch (NumberFormatException e) {
return Optional.empty();
return parseGermanShortMonth(testString);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better move the method above in the parse method, so the parseGermanMonth will be executed when the others returned empty. Just like it it is deone with the getMonthByShortName

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't know exactly what you mean.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will try to explain:
Move the method parseGermanShortMonth before the catch block. and after the getMonthByShortName:

 month= Month.parseGerman....
if(month.isPresent)
return month;

The end result is the same, but it looks more logical in the whole parse method:

  1. We check if the value is empty
  2. We try to get the normal short month
  3. we try german short month
  4. last try: try into parse it as number

Otherwise from a quick look I would see and think: Okay, if the integer parsing fails it implies it is automatically a German month. And that might not be the case. In reality the integer parsing maybe failed because we entered 9999

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation. I adjusted it.

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor issues regarding formatting an parsing of Maerz.

* a String that represents a month in German form
* @return the corresponding month instance, empty if input is not in German
* form
* @author Johannes Pre�mar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not put @author tags. We acknowledge our authors in AUTHORS. See https://github.com/JabRef/jabref/blob/master/CONTRIBUTING.md#author-credits. Please ensure that your git configuration user.name and user.email is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okey I understand, but I have done that for PE1718, but the worksheet said so for authorize. I deleted it and will put my name directly in the Pull Request.

*/
private static Optional<Month> parseGermanShortMonth(String value) {
try {
YearMonth yearMonth = YearMonth.parse("1969-" + value, DateTimeFormatter.ofPattern("yyyy-MMM", Locale.GERMAN));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indent seems to be wrong here. Use auto-format of your IDE. Ctrl+Alt+L in IntelliJ.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of #3536 was to parse all available languages. Also Spanish, Greek, etc. Therefore, all available locales were fetched at https://github.com/JabRef/jabref/pull/3536/files#diff-3c5fdac594bc2e649536f5ecdcb481ccR122 and iterated on. Is this possible here, too - or does it clutter the code?

After more thinking, I would say, we do not iterate on all locales and keep German only, because only the literature list by http://www2.informatik.uni-stuttgart.de/zdi/buecherei/NCSTRL_listings/FAK/index.html causes these issues and I am not aware of any other systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use Eclipse, hopefully Ctrl + Shift + F will have the same result.

@@ -100,4 +100,36 @@ public void parseReturnsEmptyOptionalForInvalidInput() {
public void parseReturnsEmptyOptionalForEmptyInput() {
assertEquals(Optional.empty(), Month.parse(""));
}

@Test
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adjusted it.

public void parseCorrectlyByFullNameGerman() {
assertEquals(Optional.of(Month.JANUARY), Month.parse("Januar"));
assertEquals(Optional.of(Month.FEBRUARY), Month.parse("Februar"));
assertEquals(Optional.of(Month.MARCH), Month.parse("M�rz"));
Copy link
Member

@koppor koppor Feb 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: Please also add a test for Maerz, which is returned by the Stuttgart System: http://www2.informatik.uni-stuttgart.de/zdi/buecherei/NCSTRL_listings/FAK/index.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also put in an if-statement for checking for Maerz in the method parseGermanShortMonth, because I was not sure if the Locale checks for this.

@koppor koppor added the PE1718 label Feb 18, 2018
@@ -100,4 +100,36 @@ public void parseReturnsEmptyOptionalForInvalidInput() {
public void parseReturnsEmptyOptionalForEmptyInput() {
assertEquals(Optional.empty(), Month.parse(""));
}

@Test
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adjusted it.

@@ -88,33 +88,41 @@
if (testString.length() > 3) {
testString = testString.substring(0, 3);
}
Optional<Month> month = Month.getMonthByShortName(testString);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indent is wrong here. I have no clue, what is going wrong on your side. Do you use tabs instead of spaces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what went wrong. I used the auto formatter of Eclipse, so it could be the tabs. Now it looks better after replacing the tabs with spaces by hand. I should definetly change the IDE if this is a common problem.

*/
private static Optional<Month> parseGermanShortMonth(String value) {
if ("Mae".equals(value)) {
return Month.getMonthByNumber(3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To test whether the if is needed, uncomment it and execute the unit test.
If the test is green, you won't need it....

Regarding the code style format, did you execute gradlew eclipse? It should import the correct style settings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is then red. So I leave the if-statement in there?

* form
*/
private static Optional<Month> parseGermanShortMonth(String value) {
if ("Mae".equals(value)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it is "mae" in lowercase?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation should use equalsIgnoreCase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix indent.


@Test
public void parseCorrectlyByShortNameGerman() {
assertEquals(Optional.of(Month.JANUARY), Month.parse("Jan"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add tests for lowercase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The German short forms are in uppercase. So what should be tested?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, then just copy the test cases to a new method and then just call month. Parse with the lowercase name
And execute the tests then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests were added.

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small things remain.

@@ -85,9 +89,13 @@
testString = testString.substring(0, 3);
}
Optional<Month> month = Month.getMonthByShortName(testString);
Optional<Month> monthGerman = Month.parseGermanShortMonth(testString);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move that after line 95. the German month only has to be determined if there is no English month.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* form
*/
private static Optional<Month> parseGermanShortMonth(String value) {
if ("Mae".equals(value)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix indent.

@@ -115,6 +117,9 @@
* form
*/
private static Optional<Month> parseGermanShortMonth(String value) {
// support for lowercase German month
value = value.substring(0,1).toUpperCase() + value.substring(1);
Copy link
Member

@koppor koppor Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this and the line above. Just use equalsIgnoreCase in line 123.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now no German months are accepted with lowercase, just mae.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overlooked this case. I think, you have to change the if then to following

if (("Mae".equalsIgnoreCase(value)) || ("Maerz".equalsIgnoreCase(value))) {

Please also ensure that a test case for Maerz is in there. If it was already, then I don't get why that code still worked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dateformatter is case-sensitive by default. (Adapting) the following code should work:

DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    .parseCaseInsensitive()
    .appendPattern("yyyy-MMM")
    .toFormatter(Locale.US);

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion regarding "Maerz" and "Mae" ongoing.

@@ -115,6 +117,9 @@
* form
*/
private static Optional<Month> parseGermanShortMonth(String value) {
// support for lowercase German month
value = value.substring(0,1).toUpperCase() + value.substring(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overlooked this case. I think, you have to change the if then to following

if (("Mae".equalsIgnoreCase(value)) || ("Maerz".equalsIgnoreCase(value))) {

Please also ensure that a test case for Maerz is in there. If it was already, then I don't get why that code still worked.

@koppor
Copy link
Member

koppor commented Feb 20, 2018

Please merge JabRefs' branch master (should be available at upstream/master) to resolve conflicts in CHANGELOG.md

@lenhard
Copy link
Member

lenhard commented Feb 21, 2018

The tests are still failing. From looking at the code, I think that "Mär" is not parsed correctly. Could you please investigate this?

Once all tests are passing on travis, I think we can merge this.

@Eiswindyeti Eiswindyeti deleted the add-parser-for-German-months-issue-3536 branch February 21, 2018 13:05
@Eiswindyeti Eiswindyeti restored the add-parser-for-German-months-issue-3536 branch February 21, 2018 13:07
@Eiswindyeti Eiswindyeti deleted the add-parser-for-German-months-issue-3536 branch February 21, 2018 13:07
@Eiswindyeti Eiswindyeti restored the add-parser-for-German-months-issue-3536 branch February 21, 2018 13:10
@Eiswindyeti Eiswindyeti reopened this Feb 21, 2018
@koppor koppor changed the base branch from master to wip March 9, 2018 18:57
@koppor koppor merged commit 39f495d into JabRef:wip Mar 9, 2018
@koppor
Copy link
Member

koppor commented Mar 9, 2018

I'll merge into wip, finish the PR and merge it into master.

@koppor koppor removed the PE1718 label Mar 9, 2018
@koppor
Copy link
Member

koppor commented Mar 9, 2018

I fixed the issues at #3742

  1. The java files had the wrong encoding (UTF8 was not used)
  2. The implementation for "mär" and "märz" was missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants