Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type, limit, offset, min_id, max_id, account_id to search API #10091

Merged
merged 5 commits into from
Feb 26, 2019

Conversation

Gargron
Copy link
Member

@Gargron Gargron commented Feb 20, 2019

Fix #8939

Note: The search controller was running authorize on each status, but SearchService already runs StatusFilter on each status, which calls StatusPolicy internally, so the same work was done twice.

type allows specifying what kind of results you expect back, to omit the unneeded ones. Now customizable limit (previously: always 5; now: 0-40), as well as offset allows paginating. The following params are only available for statuses search: min_id and max_id allow filtering results by date ranges, and account_id allows narrowing down results by author.

@Gargron Gargron added the api REST API, Streaming API, Web Push API label Feb 20, 2019
@nolanlawson
Copy link
Contributor

Does offset introduce potential problems if the search results change? For instance:

  1. I ask for the first 20 rows
  2. The search results change in the underlying DB
  3. Now when I ask for offset 20, I see duplicates and/or missing results.

I'm not sure if Postgres offers a good way to solve this problem, though. That's the only potential issue with this API that I see, LGTM otherwise. 👍

@apetresc
Copy link

Indeed, limit-offset-based pagination for something as fast-moving as toot search results is virtually guaranteed to run into the exact problem @nolanlawson is describing.

The standard solution is cursor-based pagination. For example, here’s how the Slack API solves a very analogous problem.

@Gargron
Copy link
Member Author

Gargron commented Feb 21, 2019

something as fast-moving as toot search results

Mastodon search is anything but fast-moving. The results are limited to owned/marked content.

Keyset pagination is what we use everywhere else, but there are a few obstacles to doing it here: Search results are not sorted by primary ID, which makes keyset pagination impossible. While ElasticSearch seems to support a search_after param, the other search types do not, plus min_id/max_id are reserved for the date range filter, so it's unclear what param to use to pass that value, not to mention that clients would need to pass an ElasticSearch-internal value which is not exposed through our API in the first place.

I've checked Discord's network traffic and it's using offset/limit for pagination too.

@Gargron
Copy link
Member Author

Gargron commented Feb 24, 2019

Any more comments? Wishlists? I could add more params like to filter by if a post has images or a video, if anyone needs that. Let me know

@BenLubar
Copy link
Contributor

It would be nice to have a way of searching posts that were recently in the home timeline in addition to posts that I've interacted with.

I frequently find myself unable to reply to a post that linked to something because I can't find it.

@witcheslive
Copy link

Looking forward to these changes, thank you!

I would like to second a consideration to relax some of the limitations on the search. It would be nice if admins could search their own instance (even if it was limited to a backend interface) or you could search everything in the last day to look for things you might have missed as they flew by but also couldn't mine.

The limitations are good overall but with alternatives like Pleroma having much more complete search (also certain Mastodon modifying their instances to allow unrestricted search) and that it doesn't seem too difficult to go directly to postgres or elasticsearch as an admin and run queries (I honestly haven't even tried but it does not seem hard at all) it might be a good idea to relax some of the limitations to make it more useful and remove a large chunk of the temptation to make some of these more far-reaching modifications.

@Gargron
Copy link
Member Author

Gargron commented Feb 25, 2019

It would be nice to have a way of searching posts that were recently in the home timeline in addition to posts that I've interacted with.

A bit out of scope here. Search permissions are implemented using a searchable_by array of account IDs on each post. For your suggestion, every post would need to have every follower's account ID in that array, which is a lot of data.

@witcheslive
Copy link

A bit out of scope here. Search permissions are implemented using a searchable_by array of account IDs on each post. For your suggestion, every post would need to have every follower's account ID in that array, which is a lot of data.

I don't know a lot about elasticsearch queries to be fair, but is there no way to make the restrictions on the query be "it's searchable_by OR has a timestamp within the last 24 hours"?

@Gargron Gargron merged commit e7f20cc into master Feb 26, 2019
@ykzts ykzts deleted the feature-advanced-search branch February 26, 2019 14:30
hiyuki2578 pushed a commit to ProjectMyosotis/mastodon that referenced this pull request Oct 2, 2019
…stodon#10091)

* Add type, limit, offset, min_id, max_id, account_id to search API

Fix mastodon#8939

* Make the offset work on accounts and hashtags search as well

* Assure brakeman we are not doing mass assignment here

* Do not allow paginating unless a type is chosen

* Fix search query and index id field on statuses instead of created_at
messenjahofchrist pushed a commit to Origin-Creative/mastodon that referenced this pull request Jul 30, 2021
…stodon#10091)

* Add type, limit, offset, min_id, max_id, account_id to search API

Fix mastodon#8939

* Make the offset work on accounts and hashtags search as well

* Assure brakeman we are not doing mass assignment here

* Do not allow paginating unless a type is chosen

* Fix search query and index id field on statuses instead of created_at
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api REST API, Streaming API, Web Push API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants