[feature:lib] Allow searching by last_updated_date #93

Merged 2 commits on Apr 17, 2021


Contributor

xuanxu commented Mar 28, 2021

Checklist

  • All old and new tests pass (ran bundle exec rspec spec in the root directory).
  • Read the contribution guidelines.
  • Updated documentation (if necessary).

Reason (or issue)

arXiv allows lastUpdatedDate to be included in the search_query param.

Description

This PR adds support for using lastUpdatedDate in the search API.

xuanxu changed the title from "Allow searching by last_updated_date" to "[feature:lib] Allow searching by last_updated_date" on Mar 28, 2021
Owner

eonu commented Apr 16, 2021

Hey! Thanks again for the help :)

Looking into this, it seems that lastUpdatedDate doesn't belong in the search_query with other fields like title and author (at least based on what it says here), and therefore it might not be possible to use it in the way that you've implemented.

Instead, it can only be specified in sortBy. It looks like sorting by lastUpdatedDate is already implemented in Arx (although it is incorrectly implemented as lastUpdated instead of lastUpdatedDate, and probably doesn't work because of this; I will fix this).

I think this means that with the arXiv search API, you can only sort by last updated date, but not search for a specific last updated date.

Owner

eonu commented Apr 16, 2021

Hmm, just had a further look and it seems that I'm wrong, and that you can actually search by last updated date, despite it not being documented on arXiv.

Instead of specifying a not very user-friendly range like search_query=lastUpdatedDate:[200712310900+TO+200712310959] in the usual way:

Arx.search do |q|
  q.last_updated_date '[200712310900+TO+200712310959]'
end

it would be cool if it could be specified as a DateTime range:

Arx.search do |q|
  q.last_updated_date DateTime.new(2006, 1, 1)..DateTime.new(2008, 1, 1)
end

but maybe we can leave that for another PR, as it would require quite a lot to be changed.

It also looks like the same can be done with submittedDate, so maybe we can do that in another PR.
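As a rough illustration of the DateTime idea, the conversion from a range to the bracketed string could look something like the sketch below. The helper name `arxiv_date_range` and the constant `ARXIV_DATE_FORMAT` are hypothetical and not part of Arx; the `YYYYMMDDHHMM` layout is inferred from the example value `200712310900`, and a literal space around `TO` is assumed (URL encoding is treated as a separate concern).

```ruby
require 'date'

# Hypothetical helper, not part of Arx: turns a DateTime range into the
# bracketed range string used for lastUpdatedDate / submittedDate queries.
ARXIV_DATE_FORMAT = '%Y%m%d%H%M' # YYYYMMDDHHMM, e.g. 200712310900

def arxiv_date_range(range)
  from = range.begin.strftime(ARXIV_DATE_FORMAT)
  to = range.end.strftime(ARXIV_DATE_FORMAT)
  "[#{from} TO #{to}]"
end

puts arxiv_date_range(DateTime.new(2006, 1, 1)..DateTime.new(2008, 1, 1))
# prints "[200601010000 TO 200801010000]"
```

A query method could then accept either a String (passed through unchanged) or a Range (converted with a helper like this), which would keep the existing string-based usage working.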

Contributor Author

xuanxu commented Apr 17, 2021

> Hmm, just had a further look and it seems that I'm wrong, and that you can actually search by last updated date, despite it not being documented on arXiv.

Yeah, the arXiv API docs are quite incomplete.

> Instead of specifying a not very user-friendly range like search_query=lastUpdatedDate:[200712310900+TO+200712310959] in the usual way
> it would be cool if it could be specified as a DateTime range

That would be great!
I wanted to be consistent with the existing code accepting strings and respect the current structure.

> but maybe we can leave that for another PR, as it would require quite a lot to be changed.
> It also looks like the same can be done with submittedDate, so maybe we can do that in another PR.

I agree. We can think about how to accept different value types within the current workflow, maybe by adding specific checks by method name in the validations or something like that.

@eonu
Copy link
Owner

eonu commented Apr 18, 2021

@xuanxu, just a note – instead of

Arx.search {|q| q.updated_at '[200712310900+TO+200712310959]'}

you have to remove the URL-encoded +, so

Arx.search {|q| q.updated_at '[200712310900 TO 200712310959]'}
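One plausible reason for this, sketched below with Ruby's standard library, is that form-encoding a query string already turns spaces into `+`, so a value that contains a literal `+` gets encoded a second time as `%2B`. Whether Arx builds its request with `URI.encode_www_form` internally is an assumption here; the snippet only demonstrates the general encoding behaviour.

```ruby
require 'uri'

# Value written with plain spaces: the encoder produces the "+" separators.
plain = URI.encode_www_form(search_query: 'lastUpdatedDate:[200712310900 TO 200712310959]')
puts plain
# prints "search_query=lastUpdatedDate%3A%5B200712310900+TO+200712310959%5D"

# Value that already contains "+": each "+" is escaped to "%2B", so the
# server would see literal plus signs instead of spaces.
pre_encoded = URI.encode_www_form(search_query: 'lastUpdatedDate:[200712310900+TO+200712310959]')
puts pre_encoded
# prints "search_query=lastUpdatedDate%3A%5B200712310900%2BTO%2B200712310959%5D"
```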

xuanxu deleted the feature/search-by-last-updated-date branch on April 27, 2021 16:43