Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add DOI matching with not full coincidence. #6756

Closed
wants to merge 5 commits into from

Conversation

KunAndrew
Copy link
Contributor

@KunAndrew KunAndrew commented Aug 11, 2020

Add DOI matching with not full coincidence.
"Improvement of Duplicate Detection #6707"
Fixes #6707

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

@KunAndrew KunAndrew force-pushed the fix-for-issue-6707 branch 2 times, most recently from c8bea4d to 6c8b7d7 Compare August 11, 2020 16:39
@@ -234,6 +234,12 @@ public String getNormalized() {
return doi;
}

public boolean isCompareNotExact(DOI o2) {
String s1 = this.doi.replaceAll("[^\\w]", "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the "replace" here necessary? Maybe it would be worthwhile to add it to the other cleanup that's already happing in the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe it is. I decided to do in this way in order to divide logics of creature, comparison and inaccurate comparison of DOI. This is associated with task context. The DOI standart is not strictly defined. Some special characters can serve as indicators or as special symbols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KunAndrew Can you add a test for this?

We agree that the DOI standard is vague at some places. However, in the concrete case, only white spaces are replaced.

Whitespace in aribtrary locations lead to a non-found doi.

image

Proposed solution: Can you put that cleanup into the constructor the next two weeks? 😇

@Siedlerchr Siedlerchr added the status: changes required Pull requests that are not yet complete label Aug 15, 2020
small improvements
@Siedlerchr
Copy link
Member

@KunAndrew How is the status here? Is this ready for review? It woud be really cool to have this, it's a massive improvement!

Moving cleanup's logic to DOI constructor.Add results of generate-authors.sh.
@KunAndrew KunAndrew marked this pull request as ready for review September 8, 2020 19:40
@KunAndrew
Copy link
Contributor Author

Sorry for the long wait.

@KunAndrew
Copy link
Contributor Author

Please check the work of scripts/generate-authors.sh. I distrust the result.

@Siedlerchr
Copy link
Member

Thanks for the update.
Normally you should not run the script as a normal user.
Can you please reverse the change to that file?
And you should merge in the upstream/master and resolve the conflicts.
If you need any assistance or help then do not hesitate to ask us

Moving cleanup's logic to DOI constructor.Add results of generate-authors.sh.
(revert AUTHORS)
@KunAndrew
Copy link
Contributor Author

I reverted the AUTHORS. I upgraded project from github. Then I merged in master (on my local machine) and pushed to remoted. But I have problem with resolving problems. "These conflicts to complex to resolve in the web editor." How should I do it? Should I create a new Pull Request (from my master-branch) and close the current one?

@Siedlerchr
Copy link
Member

@KunAndrew The best thing is to do this on the command line. You need to merge in the upstream the master branch from the jabref repository..

  1. Checkout your branch git checkout
  2. git merge upstream/master It will now tell you about conflicts
  3. open git gui to see the problems
    4 resolve them. and commit and push

If you happen to have still some problems or don't feel confident enough I can take over and do the conflict resolving

@KunAndrew
Copy link
Contributor Author

Yes, that would be great. I don't feel confident enough with git GUI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: changes required Pull requests that are not yet complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improvement of Duplicate Detection
5 participants