Add type, limit, offset, min_id, max_id, account_id to search API #10091
Does
I'm not sure if Postgres offers a good way to solve this problem, though. That's the only potential issue with this API that I see, LGTM otherwise. 👍
Indeed, limit-offset-based pagination for something as fast-moving as toot search results is virtually guaranteed to run into the exact problem @nolanlawson is describing. The standard solution is cursor-based pagination. For example, here’s how the Slack API solves a very analogous problem.
Mastodon search is anything but fast-moving. The results are limited to owned/marked content. Keyset pagination is what we use everywhere else, but there are a few obstacles to doing it here:
* Search results are not sorted by primary ID, which makes keyset pagination impossible.
* While Elasticsearch seems to support a search_after param, the other search types do not.
* min_id/max_id are reserved for the date range filter, so it's unclear what param to use to pass that value.
* Clients would need to pass an Elasticsearch-internal value which is not exposed through our API in the first place.

I've checked Discord's network traffic and it's using offset/limit for pagination too.
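To make the trade-off above concrete, here is a minimal in-memory sketch, assuming the concern with limit/offset is the usual one: results shift between requests, so a page boundary can repeat or skip items. The helper names `offset_page` and `keyset_page` are hypothetical and nothing here is Mastodon code. Note that the keyset variant only works because the list is sorted by ID, which is exactly the property search results lack.

```python
from typing import List, Optional


def offset_page(ids: List[int], limit: int, offset: int) -> List[int]:
    """Limit/offset pagination over a newest-first list of IDs."""
    return ids[offset:offset + limit]


def keyset_page(ids: List[int], limit: int, max_id: Optional[int] = None) -> List[int]:
    """Keyset pagination: up to `limit` IDs strictly below `max_id`.

    Assumes `ids` is sorted by ID, newest first.
    """
    older = [i for i in ids if max_id is None or i < max_id]
    return older[:limit]


# Hypothetical result set, newest first.
results = [50, 40, 30, 20, 10]
first_offset = offset_page(results, limit=2, offset=0)
first_keyset = keyset_page(results, limit=2)

# A new matching item arrives before the second page is requested.
results = [60] + results

# Offset pagination shows ID 40 a second time.
second_offset = offset_page(results, limit=2, offset=2)
# Keyset pagination continues below the last ID already seen.
second_keyset = keyset_page(results, limit=2, max_id=first_keyset[-1])

print(first_offset, second_offset)   # [50, 40] [40, 30]
print(first_keyset, second_keyset)   # [50, 40] [30, 20]
```

If results are ranked by relevance instead of ID, there is no single ID to continue from, which is why the comment above falls back to offset/limit.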
Any more comments? Wishlists? I could add more params, for example to filter by whether a post has images or a video, if anyone needs that. Let me know.
It would be nice to have a way of searching posts that were recently in the home timeline in addition to posts that I've interacted with. I frequently find myself unable to reply to a post that linked to something because I can't find it.
Looking forward to these changes, thank you! I would like to second a consideration to relax some of the limitations on the search. It would be nice if admins could search their own instance (even if it was limited to a backend interface) or you could search everything in the last day to look for things you might have missed as they flew by but also couldn't mine.

The limitations are good overall. However, alternatives like Pleroma have much more complete search, certain Mastodon admins are modifying their instances to allow unrestricted search, and it doesn't seem too difficult to go directly to Postgres or Elasticsearch as an admin and run queries (I honestly haven't even tried, but it does not seem hard at all). Given that, it might be a good idea to relax some of the limitations to make search more useful and remove a large chunk of the temptation to make some of these more far-reaching modifications.
A bit out of scope here. Search permissions are implemented using a searchable_by array of account IDs on each post. For your suggestion, every post would need to have every follower's account ID in that array, which is a lot of data.
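As a rough illustration of the permission model described above, the sketch below filters an in-memory list of posts by a `searchable_by` set. Only the `searchable_by` name comes from the comment; the `Post` class, the `search` helper, and the sample data are hypothetical and do not reflect how Mastodon stores or queries posts.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Post:
    id: int
    text: str
    # IDs of the accounts allowed to find this post through search.
    searchable_by: Set[int] = field(default_factory=set)


def search(posts: List[Post], query: str, account_id: int) -> List[Post]:
    """Return posts matching `query` that `account_id` may search."""
    needle = query.lower()
    return [
        p for p in posts
        if account_id in p.searchable_by and needle in p.text.lower()
    ]


posts = [
    Post(1, "Keyset pagination notes", {101, 102}),
    Post(2, "Pagination in search", {102}),
    Post(3, "Unrelated lunch post", {101}),
]

hits_101 = [p.id for p in search(posts, "pagination", 101)]
hits_102 = [p.id for p in search(posts, "pagination", 102)]
print(hits_101, hits_102)  # [1] [1, 2]
```

Making home-timeline posts searchable under this model would mean adding every follower's ID to each set, which is the data-size concern raised above.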
I don't know a lot about Elasticsearch queries to be fair, but is there no way to make the restrictions on the query be "it's searchable_by OR has a timestamp within the last 24 hours"?
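For reference, an OR restriction of that shape can be written in the Elasticsearch query DSL as a `bool` filter with two `should` clauses. The sketch below only builds the request body as a Python dict. The field names `text` and `created_at` are assumptions (the commit message in this PR suggests the statuses index may not have a `created_at` field at all), and `build_query` is a hypothetical helper, not part of Mastodon. It shows that the query is expressible, not that it would be acceptable for visibility or privacy.

```python
import json
from typing import Any, Dict


def build_query(text: str, account_id: int, window: str = "now-24h") -> Dict[str, Any]:
    """Build a search body: match `text`, restricted to posts that are
    searchable by `account_id` OR were created within `window`."""
    return {
        "query": {
            "bool": {
                # Full-text match on an assumed `text` field.
                "must": {"match": {"text": text}},
                "filter": {
                    "bool": {
                        "should": [
                            {"term": {"searchable_by": account_id}},
                            # Assumed date field; uses Elasticsearch date math.
                            {"range": {"created_at": {"gte": window}}},
                        ],
                        # At least one of the two clauses must hold.
                        "minimum_should_match": 1,
                    }
                },
            }
        }
    }


body = build_query("pagination", account_id=101)
print(json.dumps(body, indent=2))
```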
…stodon#10091)
* Add type, limit, offset, min_id, max_id, account_id to search API
  Fix mastodon#8939
* Make the offset work on accounts and hashtags search as well
* Assure brakeman we are not doing mass assignment here
* Do not allow paginating unless a type is chosen
* Fix search query and index id field on statuses instead of created_at
Fix #8939

type allows specifying what kind of results you expect back, to omit the unneeded ones. The now customizable limit (previously: always 5; now: 0-40), together with offset, allows paginating.

The following params are only available for statuses search: min_id and max_id allow filtering results by date ranges, and account_id allows narrowing down results by author.
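A minimal client-side sketch of how these params might be combined into a query string follows. The `q` param name, the allowed type values, and the validation rules are assumptions drawn from the description and commit message above (limit 0-40, no pagination without a type, status-only filters). `build_search_query` is a hypothetical helper, and the server may enforce these rules differently or silently ignore invalid params.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlencode

# Assumed values for the `type` param.
RESULT_TYPES = {"accounts", "hashtags", "statuses"}


def build_search_query(
    q: str,
    result_type: Optional[str] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    min_id: Optional[int] = None,
    max_id: Optional[int] = None,
    account_id: Optional[int] = None,
) -> str:
    """Build a URL query string for a search request."""
    if result_type is not None and result_type not in RESULT_TYPES:
        raise ValueError(f"unknown type: {result_type}")
    if limit is not None and not 0 <= limit <= 40:
        raise ValueError("limit must be between 0 and 40")
    # Commit message: "Do not allow paginating unless a type is chosen".
    if offset is not None and result_type is None:
        raise ValueError("offset requires a type")
    status_filters = {"min_id": min_id, "max_id": max_id, "account_id": account_id}
    if result_type not in (None, "statuses") and any(
        v is not None for v in status_filters.values()
    ):
        raise ValueError("min_id, max_id and account_id apply to statuses only")

    params: Dict[str, Union[str, int, None]] = {
        "q": q, "type": result_type, "limit": limit, "offset": offset,
    }
    params.update(status_filters)
    # Drop unset params; dict insertion order fixes the output order.
    return urlencode({k: v for k, v in params.items() if v is not None})


query = build_search_query(
    "pagination", result_type="statuses", limit=20, offset=40, account_id=123
)
print(query)  # q=pagination&type=statuses&limit=20&offset=40&account_id=123
```

The resulting string would be appended to the instance's search endpoint URL, whose path is not stated in this thread.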