Fix for issue 7633: Unable to download arXiv pdfs if Title contains curly brackets #7652

Merged 9 commits on Apr 20, 2021

Contributor

@Pikayue11 Pikayue11 commented Apr 20, 2021

Fix #7633

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

Reproduce the issue:

1 Open a library and select a .bib file containing the entry below

@Article{zhang_machine_2021v2,
  author  = {Zhang, Ruohan and Guo, Sihang and Liu, Bo and Zhu, Yifeng and Hayhoe, Mary and Ballard, Dana and Stone, Peter},
  journal = {arXiv:2010.15942v2 [cs]},
  title   = {Machine versus {Human} {Attention} in {Deep} {Reinforcement} {Learning} {Tasks}},
  year    = {2021},
  month   = feb,
  url     = {http://arxiv.org/abs/2010.15942v2},
}

2 Double-click the entry, then click Get fulltext to the right of **General | File**. JabRef shows the message "No full text document found".

3 If we remove the curly brackets from the title and keep everything else the same, the PDF file downloads successfully. The new title is given below

Machine versus Human Attention in Deep Reinforcement Learning Tasks

After reading the source code, I think the cause is that no DOI is present, so the author and title are used to query for the URL of the PDF file. However, a title with curly brackets may fail to find an arXiv entry, or the search succeeds but the title fails to match the arXiv title.

Process to fix the issue:

1 The arXiv-related code is in ArXiv.java

2 The URL of the PDF is determined in the method findFullText, which in turn invokes the method searchForEntries

3 I defined a method called ignoreCurlyBracket in StringUtil.java that removes the curly brackets from the title if the title is not blank.

4 In the method searchForEntries, I invoke ignoreCurlyBracket on the title before it is used as a parameter to search for the arXiv entry and before it is compared with the arXiv title.
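The steps above can be sketched roughly as follows. This is an illustrative sketch, not the merged code: only the method name ignoreCurlyBracket comes from this PR, while the class name CurlyBracketSketch, the helper titlesMatch, and the assumption that the title comparison is a case-insensitive equality check are hypothetical.

```java
// Illustrative sketch only. Apart from the method name ignoreCurlyBracket,
// every name here is hypothetical and not taken from JabRef's source.
final class CurlyBracketSketch {

    /** Removes every '{' and '}' from the title; null or blank titles are returned unchanged. */
    static String ignoreCurlyBracket(String title) {
        if (title == null || title.trim().isEmpty()) {
            return title;
        }
        return title.replace("{", "").replace("}", "");
    }

    /** Hypothetical comparison step, assumed to be a case-insensitive equality check. */
    static boolean titlesMatch(String bibTitle, String arxivTitle) {
        String cleaned = ignoreCurlyBracket(bibTitle);
        return cleaned != null && cleaned.equalsIgnoreCase(arxivTitle);
    }

    public static void main(String[] args) {
        String bibTitle = "Machine versus {Human} {Attention} in {Deep} {Reinforcement} {Learning} {Tasks}";
        String arxivTitle = "Machine versus Human Attention in Deep Reinforcement Learning Tasks";

        // Comparing the raw BibTeX title fails because of the braces.
        System.out.println(bibTitle.equalsIgnoreCase(arxivTitle)); // prints "false"
        // After stripping the braces the two titles are equal.
        System.out.println(titlesMatch(bibTitle, arxivTitle));     // prints "true"
    }
}
```

Returning blank titles unchanged keeps the helper safe to call on entries that have no title at all.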

Screenshots of the result:

Before

image-20210419154425138

After

Download successfully

image-20210419205905217

image-20210419210151694

Member

@tobiasdiez tobiasdiez left a comment

LGTM, thanks!

Please fix the checkstyle issues as reported by our review dog.

@Siedlerchr
Member

There's also an architecture violation, as StringUtil has increased in size

@Pikayue11
Contributor Author

So is it not allowed to add anything in StringUtil? Shall I simply move the function from StringUtil to Arxiv?

@Pikayue11
Contributor Author

As for checkstyle, I have only fixed part of it. The only remaining warning is "'{' is not preceded with whitespace", but adding a space before '{' may cause problems, because my statement is: replace("{", "")

@Siedlerchr
Member

@Pikayue11 Checkstyle is referring to the method definition (look at the changes tab, reviewdog marks the line)
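To illustrate the distinction, here is a minimal sketch. The class name BraceStyleSketch and the method name strip are hypothetical, and the flagged form is a guess at what the marked line looked like, since the diff is not shown in this thread. The warning concerns the opening brace of the method body, not the "{" inside the string literal.

```java
// Hypothetical example; names are not taken from the PR diff.
final class BraceStyleSketch {

    // Form that would trigger "'{' is not preceded with whitespace"
    // (kept as a comment so this file itself stays clean):
    //     static String strip(String s){

    // Accepted form: a space separates ')' from the opening '{'.
    static String strip(String s) {
        // The "{" and "}" here are string literals, so the whitespace
        // rule does not apply to them and they must stay as they are.
        return s.replace("{", "").replace("}", "");
    }
}
```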

@Pikayue11
Contributor Author

I'm sorry, now the checkstyle should be correct.

Member

@Siedlerchr Siedlerchr left a comment

Thanks, lgtm

@Siedlerchr Siedlerchr merged commit 23d7573 into JabRef:main Apr 20, 2021
@Siedlerchr
Member

The failing unit tests are related to the help. Something went wrong there.

Siedlerchr added a commit that referenced this pull request Apr 24, 2021
…om.tngtech.archunit-archunit-junit5-api-0.18.0

* upstream/main:
  Fix exception when searching (#7659)
  Fixes Jabref#7660 (#7663)
  Fix for issue 5850: Journal abbreviations in UTF-8 not recognized (#7639)
  Fix SSLHandshake Exception by using bypass (#7657)
  Fix for issue 7633: Unable to download arXiv pdfs if Title contains curly brackets (#7652)
  Fix#7195 partly Opacity of disabled icon-buttons