[8.x] Add cursor pagination (aka keyset pagination) #37216
I'm very happy to see this! I do remember I added Can you elaborate what's the deal with the post-processing of the base64 encoded values? I'm not sure it's obvious, I guess a doc comment would be nice. |
Awesome! I was glad to see it already existed when I started working on this.
If you mean why |
Makes sense!
Thanks, that part was clear to understand. But now that you mention it 😅: I'm always a sucker for validating the data. Using encryption might be overkill and causes problems because it generates different outputs for the same inputs, but I'm always a fan of making sure the payload is safe enough before I decode it. Could this be susceptible to some attack vectors, I wonder? At least using HMAC for integrity checks might be something to consider, but I checked https://github.com/basecamp/geared_pagination/blob/master/lib/geared_pagination/cursor.rb and there seems to be no additional protection there either. I guess "it's fine 🔥" then. |
@mfn yep it's secure. I checked other implementations as well. We're just base64 decoding and then json decoding the cursor string (no unserialization, etc.), so it's safe. If it's unable to decode, a null cursor is returned. |
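As a rough illustration of the decoding described above, here is a minimal sketch in PHP. The function names `encodeCursor` and `decodeCursor` are hypothetical and are not the PR's actual API; the sketch only assumes what the comment states: base64 decoding followed by JSON decoding, with `null` returned when decoding fails.

```php
<?php

// Hypothetical helpers for illustration; not the framework's actual methods.
function encodeCursor(array $params): string
{
    return base64_encode(json_encode($params));
}

function decodeCursor(?string $encoded): ?array
{
    if ($encoded === null || $encoded === '') {
        return null;
    }

    // Strict mode makes base64_decode() return false on characters
    // outside the base64 alphabet instead of silently skipping them.
    $json = base64_decode($encoded, true);
    if ($json === false) {
        return null;
    }

    // json_decode() only parses data; unlike unserialize(), it does not
    // instantiate objects of arbitrary classes.
    $params = json_decode($json, true);

    // Anything that is not a JSON object/array is treated as "no cursor".
    return is_array($params) ? $params : null;
}
```

A tampered or malformed cursor string then simply behaves like a missing cursor, which is the behaviour the comment describes.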
@paras-malhotra I support this idea, and if it is not wanted in the framework core, I'd be happy to help you turn this into a package for the community to use. :) |
That would be awesome @GrahamCampbell! I'd love to have this in the core, but if that's not possible, would very much appreciate your help to turn this into a package :) |
Ping @spawnia: this will be useful for Lighthouse as well I assume. Is there anything you can think of that can be taken in on this PR? |
Would this implementation also make it possible to generate/determine the cursor of a specific item? For example, if a chat app allows you to search in the messages of a chat and you want to navigate the user to a selected message somewhere in the history of the chat. And from the location of the message, the user can infinitely scroll up and down in the chat (similar to the search-function in the Slack app) |
@paras-malhotra I am confused by the description, should the sections about advantages and limitations be about keyset pagination? @driesvints thanks for the ping. We have had this issue in Lighthouse for quite some time, see nuwave/lighthouse#311. Nice to see that this might make it into core, definitely interesting. The API seems suitable for our purposes. I would have to do a proof-of-concept to be sure, but |
Uhh, ok I'm stupid lol. Sorry about that 🤦‍♂️ Corrected now. |
@gdebrauwer, yes. This PR takes care of that. You can construct a cursor for any item and direction like so:
use Illuminate\Pagination\Cursor;
// For single column orders, e.g. order by id.
$cursor = new Cursor(['id' => 2], true); // generate cursor for id > 2
$cursor = new Cursor(['id' => 3], false); // generate cursor for id < 3
// For multiple column orders, e.g. order by id, name
$cursor = new Cursor(['id' => 2, 'name' => 'Paras'], true); // generate cursor for (id, name) > (2, 'Paras')
$cursor = new Cursor(['id' => 3, 'name' => 'Paras'], false); // generate cursor for (id, name) < (3, 'Paras') |
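To illustrate how the parameters and direction of such a cursor could map onto a query condition, here is a minimal sketch. `cursorCondition` is a hypothetical helper, not part of the PR; it assumes a database that supports tuple comparison, ascending order on every column, and trusted column names.

```php
<?php

/**
 * Hypothetical helper: turn cursor parameters plus a direction into a
 * tuple-comparison condition with "?" placeholders and its bindings.
 */
function cursorCondition(array $params, bool $pointsForward): array
{
    // Column names are interpolated as-is, so they must come from code,
    // never from user input.
    $columns = implode(', ', array_keys($params));
    $placeholders = implode(', ', array_fill(0, count($params), '?'));

    // Forward cursors select rows after the item, backward cursors before it.
    $operator = $pointsForward ? '>' : '<';

    return ["({$columns}) {$operator} ({$placeholders})", array_values($params)];
}
```

Under these assumptions, `cursorCondition(['id' => 3, 'name' => 'Paras'], false)` yields the condition `(id, name) < (?, ?)` with bindings `[3, 'Paras']`, mirroring the last comment in the snippet above.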
Thanks @spawnia, I've incorporated your suggested changes. |
@paras-malhotra Please provide a general overview of how the feature works internally so I have something to go by when reviewing it. |
I think some example URLs for the first page and second page would be very helpful. :) |
@taylorotwell and @GrahamCampbell, sure thing. Here's how it works internally:
Step 1: Resolve the cursor
The cursor is akin to the "page number". A cursor object contains the parameter values along with the direction as mentioned in this comment. So, first the cursor is resolved from the request. An example of a URL would be So, to decode we'd first base64 decode to get All the encoding and decoding logic is encapsulated in the
Step 2: Ensure order by is set properly
framework/src/Illuminate/Database/Eloquent/Builder.php Lines 867 to 883 in 6611204
Here, we first ensure:
If the cursor points backwards ( So, a forward query (from page 2 to page 3, 10 items per page) would look like:
select * from users where id > 20 order by id asc limit 11;
And a backwards query (from page 3 to page 2, 10 items per page) would look like (note direction is reversed from
select * from users where id < 21 order by id desc limit 11;
Also, note limit is
Step 3: Create the CursorPaginator instance
After applying the correct order and where clauses, we fire the query and create the cursor paginator. The order by columns are passed as
framework/src/Illuminate/Database/Eloquent/Builder.php Lines 851 to 855 in 6611204
Once the cursor paginator is created, we save the collection and reverse the order of the collection if the cursor is pointing backwards as here:
framework/src/Illuminate/Pagination/CursorPaginator.php Lines 61 to 63 in 6611204
This reversal is to preserve the order in next and previous. For example, the items returned from the backwards query (page 3 to page 2) in step 2 would be
Step 4: Compute the URLs and cursors for next and previous pages
framework/src/Illuminate/Pagination/AbstractCursorPaginator.php Lines 162 to 170 in 6611204
Finally, we compute the next and previous cursor (and corresponding URLs). The next cursor would contain the parameters of the last paginated item in a forward direction, and the previous cursor would contain the parameters of the first paginated item in a backwards direction. So, for example, say we're on page 2 (10 items per page). The next cursor should be
That's it in a nutshell. Hope the explanation was helpful! Let me know if you have any further questions. |
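The steps above can be walked through with a small in-memory simulation. This is only a sketch under stated assumptions: `cursorPage` and its array-based cursor format are hypothetical, a plain list of integer ids stands in for the `users` table, and it is not the code from the PR. It mirrors the two queries shown in step 2 (fetch `perPage + 1` rows, reverse the backward result, derive the next and previous cursors from the last and first item).

```php
<?php

/**
 * Hypothetical sketch of one keyset "page" over an in-memory list of ids.
 * $cursor is null for the first page, or ['id' => int, 'forward' => bool].
 */
function cursorPage(array $ids, int $perPage, ?array $cursor): array
{
    $forward = $cursor === null || $cursor['forward'];

    // Step 2: ascending order for forward cursors, descending for backward
    // ones, keeping only rows strictly beyond the cursor value.
    if ($forward) {
        sort($ids);
    } else {
        rsort($ids);
    }
    if ($cursor !== null) {
        $ids = array_values(array_filter(
            $ids,
            fn (int $id): bool => $forward ? $id > $cursor['id'] : $id < $cursor['id']
        ));
    }

    // Fetch one extra row to learn whether more rows exist in this direction.
    $rows = array_slice($ids, 0, $perPage + 1);
    $hasMore = count($rows) > $perPage;
    $items = array_slice($rows, 0, $perPage);

    // Step 3: a backward query returns rows in descending order, so
    // reverse them to present the page in ascending order again.
    if (! $forward) {
        $items = array_reverse($items);
    }

    // Step 4: assumption of this sketch: when moving backward there is
    // always a "next" page (the one we came from); when moving forward
    // there is a "previous" page whenever a cursor was supplied.
    $hasNext = $forward ? $hasMore : true;
    $hasPrev = $forward ? $cursor !== null : $hasMore;

    return [
        'items' => $items,
        'next' => $hasNext && $items ? ['id' => end($items), 'forward' => true] : null,
        'prev' => $hasPrev && $items ? ['id' => $items[0], 'forward' => false] : null,
    ];
}
```

With ids 1 to 50 and 10 items per page, following `next` twice reaches ids 21 to 30, and following that page's `prev` cursor (`id < 21`, descending, limit 11) returns ids 11 to 20 in ascending order.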
@paras-malhotra if you could send over your docs draft that would be good. |
https://github.com/juampi92/cursor-pagination @GrahamCampbell |
@ahmedatef00, if you're asking for differences in implementation, I haven't used that package. But at first glance, it doesn't seem to support multiple order by clauses, rendering of links or URL encode values (which can be an issue if sorting is done by strings). I could be wrong here since I haven't really taken it for a spin. I'd advise you to use your own judgement. |
I just pasted it for reference. And yes, maybe you are right: I used it before with Lumen and it works fine with a single order by clause, but I can't say it is good for all corner cases. I think yours will be better and I can't wait to use it. |
* Add cursor pagination without tests
* Fix styleci
* Add cursor paginator tests
* Add support for query builder
* Fix tests
* Complete all tests for database and Eloquent builders
* Incorporate suggestions
* Fix styleci
* Fix docblocks
* move method
* Fix docblock
* Formatting
* Various formatting - method renaming.
* Add more tests
Co-authored-by: Taylor Otwell <taylorotwell@gmail.com>
This may be a dumb question, but does this work with API Resources? For example, will it show the cursor data in the |
No big deal, but next time please give credit. https://use-the-index-luke.com/no-offset |
@fatalmind, my bad. Just saw the attribution license on your website. I've updated my post to include the source/link. |
According to SQL Feature Comparison, SQL Server does not support tuple comparison syntax. So (a, b, c) > (1, 2, 3) should be rewritten to:
a=1 and b=2 and c>3
or
a=1 and b>2
or
a>1
If you use SQL Server, lampager/lampager-laravel: Rapid pagination for Laravel may still help with implementing cursor pagination. |
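The rewrite above generalises to any number of columns. The following is a minimal sketch of that expansion; `expandTupleGreaterThan` is a hypothetical name, column names are assumed to be trusted identifiers, and the ordered columns are assumed to contain no NULL values.

```php
<?php

/**
 * Hypothetical helper: expand (c1, ..., cn) > (v1, ..., vn) into OR-ed
 * AND groups. Returns SQL with "?" placeholders plus ordered bindings.
 */
function expandTupleGreaterThan(array $cursor): array
{
    $columns = array_keys($cursor);
    $values = array_values($cursor);
    $groups = [];
    $bindings = [];

    // Longest group first: all leading columns equal, the last one greater.
    for ($i = count($columns) - 1; $i >= 0; $i--) {
        $parts = [];
        for ($j = 0; $j < $i; $j++) {
            $parts[] = "{$columns[$j]} = ?";
            $bindings[] = $values[$j];
        }
        $parts[] = "{$columns[$i]} > ?";
        $bindings[] = $values[$i];
        $groups[] = '(' . implode(' and ', $parts) . ')';
    }

    return [implode(' or ', $groups), $bindings];
}
```

For `['a' => 1, 'b' => 2, 'c' => 3]` this produces `(a = ? and b = ? and c > ?) or (a = ? and b > ?) or (a > ?)` with bindings `[1, 2, 3, 1, 2, 1]`.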
Keyset pagination really does not work on SQL Server: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=dbf29cc91f08c2185bde4a521c3508e5 But I am not sure how efficiently SQL Server will execute this DNF condition, so maybe it's best to state in the manual that it is not compatible with SQL Server. |
Emulation of row value syntax is a hassle, but still possible with most of the benefits of keyset pagination (including performance). The crucial point is to make the leftmost condition "indexable". E.g.
becomes
If there is an index on See this for more details: https://use-the-index-luke.com/sql/partial-results/fetch-next-page#sb-row-values |
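One way to generate such an "indexable" condition is sketched below. It follows the `a >= ? and not (...)` pattern used by the second query posted later in this thread; `indexableGreaterThan` is a hypothetical name, column names are assumed to be trusted, and the ordered columns are assumed to contain no NULL values.

```php
<?php

/**
 * Hypothetical helper: rewrite (c1, ..., cn) > (v1, ..., vn) so that the
 * leftmost column appears in a plain range condition, followed by
 * NOT (...) groups that exclude the rows at or before the cursor.
 */
function indexableGreaterThan(array $cursor): array
{
    if ($cursor === []) {
        throw new InvalidArgumentException('At least one column is required.');
    }

    $columns = array_keys($cursor);
    $values = array_values($cursor);
    $last = count($columns) - 1;

    // A single column needs no emulation.
    if ($last === 0) {
        return ["{$columns[0]} > ?", [$values[0]]];
    }

    // Leftmost column: a simple range condition.
    $clauses = ["{$columns[0]} >= ?"];
    $bindings = [$values[0]];

    for ($k = 1; $k <= $last; $k++) {
        $parts = [];
        for ($j = 0; $j < $k; $j++) {
            $parts[] = "{$columns[$j]} = ?";
            $bindings[] = $values[$j];
        }
        // Only the final column excludes equality, because a row equal
        // to the cursor in every column must not be returned again.
        $operator = $k === $last ? '<=' : '<';
        $parts[] = "{$columns[$k]} {$operator} ?";
        $bindings[] = $values[$k];
        $clauses[] = 'not (' . implode(' and ', $parts) . ')';
    }

    return [implode(' and ', $clauses), $bindings];
}
```

For two columns `(a, b) > (1, 2)` the result is `a >= ? and not (a = ? and b <= ?)`, which is logically equivalent to `a > 1 or (a = 1 and b > 2)`.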
I've actually verified MySQL 5.6 and 5.7 represents
Index does not work (expected behavior):
a<1 and b=2 and c>3
or
a<1 and b>2
or
a>1
Index works well with
a=1 and b=2 and c>3
or
a=1 and b>2
or
a>1 |
Unfortunately, the MySQL EXPLAIN output can be pretty confusing. "Using index condition" just means that it does the filtering directly with the information found in the index, which is better than first fetching the full row from the table and then doing the filtering (Extra: Using Where). However, filtering in this context means checking whether or not a row matches after reading it. For performance it is actually desirable to not even read the rows that we don't need. This is what is sometimes referred to as access (not filter).
Unfortunately, MySQL cannot access indexes with row-value predicates. But if we phrase the WHERE condition as described in my article (which is, to the very best of my knowledge, still current, even for MySQL 8.0), then the first column can be used to access the index, while the remaining ones can only be used for filtering. Of course it is better to do that filtering before accessing the table ("Using Index Condition") than after that ("Using Where"). But using at least one column for access is still desirable.
If you want to check this out, compare your second example with one you build as described in the mentioned article. The crucial information to watch for in the EXPLAIN output is in the "ref" (and maybe "key_len") columns. They indicate what part of the index was used for access. We want as much as possible from an index on (a, b, c) to be used as access predicates. In the "ref" column this is visible by the number of items it lists (comma-separated); in the key_len column a higher value is better. See also: https://use-the-index-luke.com/sql/explain-plan/mysql/access-filter-predicates |
@fatalmind I've tried to estimate using actual production table data. How do we evaluate them? The above one looks to have larger
explain
select * from posts use index(posts_community_id_commented_at_index)
where
community_id=17 and commented_at="2015-12-12 08:41:05" and id>547
or community_id=17 and commented_at>"2015-12-12 08:41:05"
or community_id>17
limit 5;
explain
select * from posts use index(posts_community_id_commented_at_index)
where
community_id >=17
and not (community_id=17 and commented_at<"2015-12-12 08:41:05")
and not (community_id=17 and commented_at="2015-12-12 08:41:05" and id<=547)
limit 5;
|
Hi! Your queries have one important mistake: they miss the ORDER BY clause ;) But that doesn't change a lot in the execution plan. Another small mistake in the second query is that
Regarding performance:
The long story: In MySQL, also in old versions, the performance difference between the two approaches can be seen by looking at how many IO operations the database does to run the query. (I gave up making sense of the EXPLAIN output for this purpose.) After adding the ORDER BY clause, the first query needs 6 read operations while the second only needs 5. Not a really big difference, both are basically fine.
The same experiment on SQL Server gives 6 read operations for the first, but only 3 for the second query. They are still both VERY fast compared to offset, but the second is even faster.
The same experiment on Oracle, which also doesn't have decent row-value support, gives 102 IOs for the first query but only 4 for the second.
What I want to say is that there is a pattern: the one approach is always better than the other one. While it might be only marginally better in some cases, it's never slower. That's why I recommend going for this approach.
For reference & your enjoyment I'm adding five files: three test scripts for MySQL, SQL Server and Oracle, and also the output for SQL Server & Oracle, as I don't know whether you have access to these systems at the moment. I hope this helps. And thanks for following up and coming back with reasonable questions!
keyset_demo.oracle.out.txt |
@fatalmind Thank you for estimating the problem for us! |
@fatalmind Interesting comment: #37762 (comment). In MySQL, due to the bug, OR-AND conditions sometimes perform faster than Tuple-Comparison ones. |
Background
This PR implements cursor pagination in Laravel. The cursor is a base64 encoded string that contains the comparison parameter values (see below).
Laravel Current Pagination (Offset Pagination)
Laravel's current implementation of pagination is offset based. This generates queries like so (for the 2nd page):
OR for a multiple ordered table
Cursor Pagination (aka Keyset Pagination)
Cursor pagination on the other hand, uses comparison operations instead of offset. This generates queries like so (for the 2nd page):
OR for a multiple ordered table
Usage
Usage is exactly the same as simplePaginate:
Advantages Of Cursor Pagination
Limitations Of Cursor Pagination
References
Implementations In Other Frameworks
Note
I've implemented this as separate classes (for the interface, abstract, etc.) rather than extending Paginator because several methods in the contract/abstract classes did not make sense for cursor pagination (e.g. currentPage() returning an int, or url expecting an int page).
UPDATE: I've written a blog post on this if anyone's interested to read the pros and cons of offset and cursor pagination.