-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parser for German months (#3536) #3734
Add parser for German months (#3536) #3734
Conversation
CHANGELOG.md
Outdated
@@ -34,6 +34,7 @@ The new default removes the linked file from the entry instead of deleting the f | |||
- The group editing window can now also be called by double-clicking the group to be edited. [koppor#277](https://github.com/koppor/jabref/issues/277) | |||
- The magnifier icon at the search shows the [search mode](https://help.jabref.org/en/Search#search-modes) again. [#3535](https://github.com/JabRef/jabref/issues/3535) | |||
- We added a new cleanup operation that replaces ligatures with their expanded form. [#3613](https://github.com/JabRef/jabref/issues/3613) | |||
- We added the function to parse German month names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coressponding issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I forgot. Fixed it.
* @author Johannes Preßmar | ||
*/ | ||
private static Optional<Month> parseGermanShortMonth(String value) { | ||
String[] shortMonths = new DateFormatSymbols(Locale.GERMAN).getShortMonths(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use the new java 8 Date & Time Api
Construt a DateTimeFormatter.ofPattern("MMM", Locale.DE)
Use YearMonth.parse and then you can get a Numeric value from that
https://docs.oracle.com/javase/8/docs/api/java/time/YearMonth.html#parse-java.lang.CharSequence-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it. Is it better like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! Looks better now
@@ -93,9 +95,31 @@ | |||
int number = Integer.parseInt(value); | |||
return Month.getMonthByNumber(number); | |||
} catch (NumberFormatException e) { | |||
return Optional.empty(); | |||
return parseGermanShortMonth(testString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better move the method above in the parse method, so the parseGermanMonth will be executed when the others returned empty. Just like it it is deone with the getMonthByShortName
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't know exactly what you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I will try to explain:
Move the method parseGermanShortMonth before the catch block. and after the getMonthByShortName:
month= Month.parseGerman....
if(month.isPresent)
return month;
The end result is the same, but it looks more logical in the whole parse method:
- We check if the value is empty
- We try to get the normal short month
- we try german short month
- last try: try into parse it as number
Otherwise from a quick look I would see and think: Okay, if the integer parsing fails it implies it is automatically a German month. And that might not be the case. In reality the integer parsing maybe failed because we entered 9999
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation. I adjusted it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor issues regarding formatting an parsing of Maerz
.
* a String that represents a month in German form | ||
* @return the corresponding month instance, empty if input is not in German | ||
* form | ||
* @author Johannes Pre�mar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not put @author
tags. We acknowledge our authors in AUTHORS
. See https://github.com/JabRef/jabref/blob/master/CONTRIBUTING.md#author-credits. Please ensure that your git configuration user.name
and user.email
is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okey I understand, but I have done that for PE1718, but the worksheet said so for authorize. I deleted it and will put my name directly in the Pull Request.
*/ | ||
private static Optional<Month> parseGermanShortMonth(String value) { | ||
try { | ||
YearMonth yearMonth = YearMonth.parse("1969-" + value, DateTimeFormatter.ofPattern("yyyy-MMM", Locale.GERMAN)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indent seems to be wrong here. Use auto-format of your IDE. Ctrl+Alt+L in IntelliJ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of #3536 was to parse all available languages. Also Spanish, Greek, etc. Therefore, all available locales were fetched at https://github.com/JabRef/jabref/pull/3536/files#diff-3c5fdac594bc2e649536f5ecdcb481ccR122 and iterated on. Is this possible here, too - or does it clutter the code?
After more thinking, I would say, we do not iterate on all locales and keep German only, because only the literature list by http://www2.informatik.uni-stuttgart.de/zdi/buecherei/NCSTRL_listings/FAK/index.html causes these issues and I am not aware of any other systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use Eclipse, hopefully Ctrl + Shift + F will have the same result.
@@ -100,4 +100,36 @@ public void parseReturnsEmptyOptionalForInvalidInput() { | |||
public void parseReturnsEmptyOptionalForEmptyInput() { | |||
assertEquals(Optional.empty(), Month.parse("")); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted it.
public void parseCorrectlyByFullNameGerman() { | ||
assertEquals(Optional.of(Month.JANUARY), Month.parse("Januar")); | ||
assertEquals(Optional.of(Month.FEBRUARY), Month.parse("Februar")); | ||
assertEquals(Optional.of(Month.MARCH), Month.parse("M�rz")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: Please also add a test for Maerz
, which is returned by the Stuttgart System: http://www2.informatik.uni-stuttgart.de/zdi/buecherei/NCSTRL_listings/FAK/index.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also put in an if-statement for checking for Maerz
in the method parseGermanShortMonth
, because I was not sure if the Locale checks for this.
@@ -100,4 +100,36 @@ public void parseReturnsEmptyOptionalForInvalidInput() { | |||
public void parseReturnsEmptyOptionalForEmptyInput() { | |||
assertEquals(Optional.empty(), Month.parse("")); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted it.
@@ -88,33 +88,41 @@ | |||
if (testString.length() > 3) { | |||
testString = testString.substring(0, 3); | |||
} | |||
Optional<Month> month = Month.getMonthByShortName(testString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indent is wrong here. I have no clue, what is going wrong on your side. Do you use tabs instead of spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what went wrong. I used the auto formatter of Eclipse, so it could be the tabs. Now it looks better after replacing the tabs with spaces by hand. I should definetly change the IDE if this is a common problem.
*/ | ||
private static Optional<Month> parseGermanShortMonth(String value) { | ||
if ("Mae".equals(value)) { | ||
return Month.getMonthByNumber(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To test whether the if is needed, uncomment it and execute the unit test.
If the test is green, you won't need it....
Regarding the code style format, did you execute gradlew eclipse? It should import the correct style settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is then red. So I leave the if-statement in there?
* form | ||
*/ | ||
private static Optional<Month> parseGermanShortMonth(String value) { | ||
if ("Mae".equals(value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it is "mae" in lowercase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation should use equalsIgnoreCase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix indent.
|
||
@Test | ||
public void parseCorrectlyByShortNameGerman() { | ||
assertEquals(Optional.of(Month.JANUARY), Month.parse("Jan")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add tests for lowercase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The German short forms are in uppercase. So what should be tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, then just copy the test cases to a new method and then just call month. Parse with the lowercase name
And execute the tests then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests were added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small things remain.
@@ -85,9 +89,13 @@ | |||
testString = testString.substring(0, 3); | |||
} | |||
Optional<Month> month = Month.getMonthByShortName(testString); | |||
Optional<Month> monthGerman = Month.parseGermanShortMonth(testString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move that after line 95. the German month only has to be determined if there is no English month.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* form | ||
*/ | ||
private static Optional<Month> parseGermanShortMonth(String value) { | ||
if ("Mae".equals(value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix indent.
@@ -115,6 +117,9 @@ | |||
* form | |||
*/ | |||
private static Optional<Month> parseGermanShortMonth(String value) { | |||
// support for lowercase German month | |||
value = value.substring(0,1).toUpperCase() + value.substring(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this and the line above. Just use equalsIgnoreCase in line 123.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now no German months are accepted with lowercase, just mae
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked this case. I think, you have to change the if then to following
if (("Mae".equalsIgnoreCase(value)) || ("Maerz".equalsIgnoreCase(value))) {
Please also ensure that a test case for Maerz
is in there. If it was already, then I don't get why that code still worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dateformatter is case-sensitive by default. (Adapting) the following code should work:
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern("yyyy-MMM")
.toFormatter(Locale.US);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion regarding "Maerz" and "Mae" ongoing.
@@ -115,6 +117,9 @@ | |||
* form | |||
*/ | |||
private static Optional<Month> parseGermanShortMonth(String value) { | |||
// support for lowercase German month | |||
value = value.substring(0,1).toUpperCase() + value.substring(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked this case. I think, you have to change the if then to following
if (("Mae".equalsIgnoreCase(value)) || ("Maerz".equalsIgnoreCase(value))) {
Please also ensure that a test case for Maerz
is in there. If it was already, then I don't get why that code still worked.
Please merge JabRefs' branch |
The tests are still failing. From looking at the code, I think that "Mär" is not parsed correctly. Could you please investigate this? Once all tests are passing on travis, I think we can merge this. |
I'll merge into |
I fixed the issues at #3742
|
Changed the month parse that it can also parse German months now.
This should solve issue #3536.
Have I missed something?
(For PE1718: Name: Johannes Preßmar)