Return None in get_paper and get_papers when data is none instead of failing to construct a Paper #80
Comments
@nathimel Thanks for pointing out the issue. If you could share an example of when this error happens, that'd be helpful for testing. |
Rather than creating a new issue, I believe I'm having the same bug. Part of the issue outside of the package is that papers seem to disappear in SS from time to time. But this makes it impossible to bulk download because if one paper doesn't exist out of 500, the whole function call fails: Reproducible code:
|
Hi @kochbj, I believe the issue you're facing with A better approach might be for In contrast, @nathimel's issue is different. It's not about missing IDs but about receiving a response with no data, leading to failures. |
Hi @danielnsilva, thanks for the quick reply! I believe the solution you proposed would be great. My use case is that I want to download metadata for 200,000 papers using ArxivIds. If I can do that for 500 papers at a time, great. But if even 40 papers don't have IDs, I have to go back to doing 50 at a time, then 10 at a time, then 1 at a time until I find the bad ID. That uses up my API calls. If you could simply return None and/or throw a warning, that would be splendid! |
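For anyone stuck on the narrowing-down described above before a fix lands, here is a minimal sketch of it as recursive bisection. The names fetch_with_bisection and fetch_batch are hypothetical and not part of semanticscholar; fetch_batch stands for any callable that takes a list of IDs and raises when the batch contains a bad one.

```python
from typing import Callable, List, Sequence, Tuple


def fetch_with_bisection(
    ids: Sequence[str],
    fetch_batch: Callable[[Sequence[str]], list],
) -> Tuple[list, List[str]]:
    """Return (results, bad_ids), halving a failing batch until each
    failure is pinned to a single ID."""
    if not ids:
        return [], []
    try:
        return list(fetch_batch(ids)), []
    except Exception:
        # Broad catch for the sketch only: a network error would also
        # end up here and be reported as a "bad" ID.
        if len(ids) == 1:
            return [], [ids[0]]
        mid = len(ids) // 2
        left_ok, left_bad = fetch_with_bisection(ids[:mid], fetch_batch)
        right_ok, right_bad = fetch_with_bisection(ids[mid:], fetch_batch)
        return left_ok + right_ok, left_bad + right_bad
```

Every failing batch costs extra requests, which is exactly the API-call overhead mentioned above, so this is a stopgap rather than a substitute for the library handling missing IDs itself.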
Update the get_papers() method to support returning a list of not found paper IDs. When the return_not_found parameter is set to True, the method now returns a tuple containing both a list of found papers and a list of not found IDs. This enhancement addresses the issue where handling of missing papers was not clear.
@kochbj I've fixed
pip install --no-cache-dir --force-reinstall git+https://github.com/danielnsilva/semanticscholar@issue-80
from semanticscholar import SemanticScholar
sch = SemanticScholar()
list_of_paper_ids = [
    'CorpusId:211530585',
    'CorpusId:470667',
    '10.2139/ssrn.2250500',
    '0f40b1f08821e22e859c6050916cec3667778613'
]
list_of_papers, list_of_not_found_ids = sch.get_papers(list_of_paper_ids, return_not_found=True) |
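For the bulk-download use case mentioned earlier in the thread, a rough sketch of how the new return value could be collected across batches. The helpers chunked and fetch_all are hypothetical names, not part of the package; get_papers is passed in as a callable and is assumed to behave like sch.get_papers on the issue-80 branch, returning a (found, not_found) tuple when return_not_found=True. The default batch size of 500 is the figure quoted in this thread.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def chunked(ids: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of ids with at most size items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_all(
    ids: Sequence[str],
    get_papers: Callable[..., Tuple[list, List[str]]],
    batch_size: int = 500,
) -> Tuple[list, List[str]]:
    """Call get_papers once per batch and merge found / not-found lists."""
    found: list = []
    not_found: List[str] = []
    for batch in chunked(ids, batch_size):
        # Assumed signature: get_papers(ids, return_not_found=True)
        papers, missing = get_papers(batch, return_not_found=True)
        found.extend(papers)
        not_found.extend(missing)
    return found, not_found
```

With the branch installed, this would presumably be called as fetch_all(list_of_paper_ids, sch.get_papers), so a missing paper ends up in the second list instead of aborting the whole run.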
Wonderful. Can't wait to try it out. :) Thanks! |
I couldn't reproduce @nathimel's issue. I'm closing this issue for now, but feel free to reopen it if necessary. |
I am using semanticscholar to get a large number of papers while traversing the S2AG iteratively, and sometimes this results in queries with SemanticScholar.get_paper that result in data being None.
This causes an error here in Paper._init_attributes of course, because that method assumes that data is a dict.
As a temporary workaround, I've been writing custom get_paper and get_papers methods that change the line return Paper(data) to return Paper(data) if data is not None else None. (See here for an example.) Otherwise I like to use semanticscholar as is. But this gets tedious to maintain in parallel with updates to semanticscholar; for example, I now need to write more complicated functions to keep up with AsyncSemanticScholar.
Would the developers consider returning None, or otherwise not causing SemanticScholar to throw an error that would stop a loop that I might not be monitoring?
I suppose the alternative is to include try except blocks in my code and manually return None, but that seems uglier. On the other hand, the developers might feel it is preferred. Thanks for your consideration.
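For reference, the try/except alternative might look roughly like the sketch below. get_or_none is a hypothetical helper, not something semanticscholar provides; it wraps any single-ID lookup callable and returns None with a warning instead of letting the exception stop the loop.

```python
import warnings
from typing import Any, Callable, Optional


def get_or_none(fetch: Callable[[str], Any], paper_id: str) -> Optional[Any]:
    """Call fetch(paper_id); on any exception, warn and return None."""
    try:
        return fetch(paper_id)
    except Exception as exc:
        # Broad catch on purpose for this sketch: an unattended loop keeps
        # going, and the warning records which ID failed and why.
        warnings.warn(f"lookup failed for {paper_id!r}: {exc!r}")
        return None
```

In a traversal loop this would be used as paper = get_or_none(sch.get_paper, paper_id), followed by a check for None before touching the result.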