Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #1181: Improved "Normalize to BibTeX name format" #1470

Merged
merged 15 commits into from
Sep 12, 2016

Conversation

bruehldev
Copy link
Contributor

@bruehldev bruehldev commented Jun 2, 2016

Fixed #1181: Improved "Normalize to BibTeX name format": Separated names with colons only work now

  • Change in CHANGELOG.md described
  • Tests created for changes
  • Screenshots added (for bigger UI changes)

@koppor
Copy link
Member

koppor commented Jun 2, 2016

It shows The command "./gradlew check integrationTest" exited with 1, which should not be caused by this PR. LGTM.

@stefan-kolb
Copy link
Member

The problem are always the GUI tests. It's annoying...

@@ -36,6 +36,28 @@ public String getKey() {

@Override
public String format(String value) {
// Fixes https://github.com/JabRef/jabref/issues/1181
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this comment. It's not relevant that this solved some issue.

@tobiasdiez
Copy link
Member

As I said in #1181, there should be some additional handling of author names with Jr part like von Last, Jr, First. Moreover, the code probably has to go into the AuthorList parser.

public void testSeperatedCommaNames() {
// Testing the issue "https://github.com/JabRef/jabref/issues/1181".
// Every second comma should become a semicolon
public void testSeperatedCommaNames_EverySecondCommaBecomesAnAnd() {
Copy link
Member

@stefan-kolb stefan-kolb Jun 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not describe the use case here? allowConcatenationOfAuthorsWithCommas or testConcatenationOfAuthorsWithCommas

@koppor
Copy link
Member

koppor commented Jun 2, 2016

I think (as outlined in #1181 (comment)), that Jr are seldom and that I need a thing working for 80% of the cases and not covering 100%. We just looked at AuthorListParse.getAuthor() and it seems to be very difficult to include the special treatment there. We would have to include hard coded strings for jrPart to be able to distinguish them from other strings.

The fix works for me and really improves my workflow.

@stefan-kolb
Copy link
Member

Do you often have authors separated by colons? I would have guessed this is also a very seldom case.

@tobiasdiez
Copy link
Member

In my opinion, the cleanup operations / save actions should respect the bib(la)tex standard before anything else. Otherwise they keep reformatting valid bibtex and thus become useless.

Sorry, but I just can't see an automatic way to correctly determine the authors if they are only separated by comma. Consider strings like
Surname, Jr, First, Surname2, First2
or Surname, First, First2 Surname2.

@stefan-kolb
Copy link
Member

Is a comma even allowed for separating authors in Bib(la)tex?

@koppor
Copy link
Member

koppor commented Jun 3, 2016

IMHO we discussed the separator at some other place that biblatex can be configured to use something else than and, but I do not find the source.

In bibtex, and really has to be used.

This tweak happens only if only commans and 2 or more commas and even number of commas

Thus, it will currently. destroy both of your examples. - We can include a checker if your special cases ( Jr) and parts with one word only (Surname, First at the second example) and not handle these.

I still think that you have these special cases, but that these are not the majority of the cases.

@tobiasdiez
Copy link
Member

tobiasdiez commented Jun 3, 2016

I'm fine with every change as long as standard bibtex names don't get destroyed. So in particular:

  • Last, First and Last2, First2 and Last3, First3
  • Last, First and Last2, First2 and Last3, First3 and First4 Last4 (this should be converted to Last4, First4)
  • Last, Jr, First and Last2, First2
  • Last and Last2, First2 and Last3, First3 and Last4, First4

Moreover, semi-correct names using semicolon should still be converted to the correct format (i.e. replace every and in the above examples with a semicolon in the input). Please add corresponding tests.

@@ -36,6 +36,25 @@ public String getKey() {

@Override
public String format(String value) {
// Handle case names in order lastname, firstname and separated by ","
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sourround the new block with

if (!value.containts(" and ")) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test also { as well as ;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For checking for Curly Braces there is a String Util Method which also has
a Unit Test. Maybe you can reuse that.

2016-06-03 14:13 GMT+02:00 Tobias Diez notifications@github.com:

In
src/main/java/net/sf/jabref/logic/formatter/bibtexfields/NormalizeNamesFormatter.java
#1470 (comment):

@@ -36,6 +36,25 @@ public String getKey() {

 @Override
 public String format(String value) {
  •    // Handle case names in order lastname, firstname and separated by ","
    

test also { as well as ;


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/JabRef/jabref/pull/1470/files/efdced6ec231b82f0f8e82a32a57a2f0fb454fb7#r65697196,
or mute the thread
https://github.com/notifications/unsubscribe/AATi5PDxbzRee0GVpU2oQWEe35bpUgUMks5qIBpNgaJpZM4Is5D7
.

@koppor
Copy link
Member

koppor commented Jun 3, 2016

😮 - completely overseen that case. 🙈 - Thank you for clarifying!

@bruehldev Please include that test and fix the issue. I indicated the necessary change (check for value.containts(" and ")) at the appropriate place.

@@ -69,4 +105,15 @@ public String getExampleInput() {
return "Albert Einstein and Alan Turing";
}

private static boolean contains(final String[] array, final String[] searchTerms) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use arrays. Use Collections. Collections.contains is your friend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this unused method.

@matthiasgeiger
Copy link
Member

Please check whether your improvement also solves #1504 by adding the example as an additional test case.

Thanks!

@tschechlovdev tschechlovdev changed the title Fixed #1181: Improved "Normalize to BibTeX name format" [WIP]Fixed #1181: Improved "Normalize to BibTeX name format" Jun 27, 2016
@@ -24,6 +30,9 @@
*/
public class NormalizeNamesFormatter implements Formatter {

// Avoid partition where these values are contained
private final Collection<String> avoidTermsInLowerCase = Arrays.asList("jr", "sr", "jnr", "snr", "von", "zu", "van", "der");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "von", "zu", "van" also covered by test cases? My feeling is that the code might treat them wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this called "affixesInLowerCase"? - Why is "von", "zu", "van" and "der" an affix? I think, only jr, sr, jnr, and snr are affixes. - As far as I can guess, "van" are not affixes

When reading https://en.wikipedia.org/wiki/Suffix_(name), it seems that jr is a name suffix, isn't it?

@koppor koppor changed the title [WIP]Fixed #1181: Improved "Normalize to BibTeX name format" [WIP] Fixes #1181: Improved "Normalize to BibTeX name format" Jul 12, 2016
@Braunch
Copy link
Contributor

Braunch commented Jul 26, 2016

What is the status of this?

@Braunch Braunch added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Sep 3, 2016
@bruehldev
Copy link
Contributor Author

Rebased to upstream master. Should be ready to merge now

@tobiasdiez
Copy link
Member

As far as I can see the new code is still in the NormalizeNamesFormatter. Please also add the cases mentioned in #1470 (comment) as tests for the AuthorList.parse method, i.e. as assertEqual(expectedAuthor, Authorlist.parse(...)). Thanks.
Ready-for-review removed until this is done.

@tobiasdiez tobiasdiez removed the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Sep 7, 2016
@bruehldev
Copy link
Contributor Author

@tobiasdiez sorry for the wrong statement. I thought it was pushed, From now on i will double check my uploaded code.

I added the test cases for semicolons. The standart format is already tested by threeAuthorsSeperatedByAnd, testMultipleNameAffixes, testNormalizeAuthorList. I hope it's okay that i'm using last2,3,4 and first2,3,4 as names in the semicolon test.

@koppor koppor changed the title [WIP] Fixes #1181: Improved "Normalize to BibTeX name format" Fixes #1181: Improved "Normalize to BibTeX name format" Sep 12, 2016
@koppor koppor merged commit d9dc3a8 into JabRef:master Sep 12, 2016
zesaro pushed a commit to zesaro/jabref that referenced this pull request Nov 22, 2016
@tobiasdiez tobiasdiez mentioned this pull request Mar 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants