[8.x] Mixed orders in cursor paginate #37762
0638ded to cce9fe3
Ping @paras-malhotra |
@halaei, the reason I didn't take this path is because of the changing behaviour of index usage based on the DB engine. I guess if we construct queries like this, we'll need to check if all 4 databases are able to use: (a) a composite index and (b) an index on the first column if a composite index is missing, on the queries constructed. The benefit of the tuple comparison operator is that the DB engine takes care of this automatically. I haven't looked into index support for database engines but you may find this article useful: https://use-the-index-luke.com/sql/partial-results/fetch-next-page |
I am not sure about database support either. Ideally DB engines should be able to find and use the best index in the most efficient way without any hint, but I don't know if they are all as perfect as we need them to be. Regardless of this PR, I would really like to know how DB engines work in this regard. One important consideration is that according to sql-workbench, tuple comparison with |
I'd recommend we stick to the tuple comparison to avoid extra complexity on constructing queries for index usage. If we go with this PR, we should test it or check out the RDBMS docs to ensure the index is actually used. The whole idea of cursor pagination is for better query performance, and if it messes up the execution plan, in the end it may not be worth it from a performance perspective when compared to offset pagination. For the issue with the incompatibility with SQL Server, I think we can just mention that in the docs. This way, at least we're sure that cursor pagination does in fact increase query performance for the databases that are supported. |
According to the MySQL optimization docs, there are scenarios where using row constructors (i.e. tuples) actually degrades performance. So maybe not using tuple comparison is actually a performance optimization! I will try to check this with a benchmark. |
My benchmark result for MySQL 8.0.25:
Table
CREATE TABLE `foos` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `foo` int NOT NULL,
  `bar` int NOT NULL,
  `baz` int NOT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `foos_foo_bar_baz_index` (`foo`,`bar`,`baz`)
) ENGINE=InnoDB AUTO_INCREMENT=2000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Data
The table is seeded with 2 million records, with
for ($foo = 0; $foo < 10; $foo++) {
    for ($bar = 0; $bar < 10; $bar++) {
        for ($i = 0; $i < 20000; $i += 100) {
            $values = [];
            $date = now();
            for ($baz = $i; $baz < $i + 100; $baz++) {
                $values[] = [
                    'foo' => $foo,
                    'bar' => $bar,
                    'baz' => $baz,
                    'created_at' => $date,
                    'updated_at' => $date,
                ];
            }
            \App\Models\Foo::query()->insert($values);
        }
    }
}
Queries
In the following queries we try to get 50 elements from the near end of the index, to make sure they will be slow unless they perfectly use the multi-column index.
1. Paginating all items:
1.1. Using single column comparison:
select * from `foos` where (`foo` > 9 or (`foo` = 9 and (`bar` > 9 or (`bar` = 9 and (`baz` > 19000))))) order by `foo` asc, `bar` asc, `baz` asc limit 51;
Runtime: less than 2 milliseconds.
1.2. Using tuple comparison:
select * from `foos` where (`foo`, `bar`, `baz`) > (9, 9, 19000) order by `foo` asc, `bar` asc, `baz` asc limit 51
Runtime: 16 seconds.
2. Paginating items with
|
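For readers following along, here is a minimal sketch of how the nested single-column comparison from query 1.1 could be generated for an arbitrary list of ordered columns, including mixed `asc`/`desc` directions as in this PR's title. This is not the code in the PR: the function name `cursorWhere` and its parameter shapes are hypothetical, and it is plain PHP with no framework dependency.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the framework): expand a cursor into
 * nested single-column comparisons instead of a tuple comparison.
 *
 * @param array<string, string> $orders column => 'asc'|'desc', in ORDER BY order
 * @param array<string, mixed>  $cursor column => value from the last row of the previous page
 * @return array{0: string, 1: array} SQL fragment and positional bindings
 */
function cursorWhere(array $orders, array $cursor): array
{
    $columns = array_keys($orders);
    $sql = '';
    $bindings = [];

    // Build from the innermost (last) column outwards.
    for ($i = count($columns) - 1; $i >= 0; $i--) {
        $column = $columns[$i];
        // Ascending columns move forward with ">", descending ones with "<".
        $operator = strtolower($orders[$column]) === 'desc' ? '<' : '>';
        // MySQL-style identifier quoting, matching the queries above.
        $quoted = '`' . str_replace('`', '``', $column) . '`';

        if ($sql === '') {
            // Last column: a plain comparison.
            $sql = "($quoted $operator ?)";
            $bindings = [$cursor[$column]];
        } else {
            // Earlier column: strictly past the cursor, or tied and decided by the rest.
            $sql = "($quoted $operator ? or ($quoted = ? and $sql))";
            $bindings = array_merge([$cursor[$column], $cursor[$column]], $bindings);
        }
    }

    return [$sql, $bindings];
}

// Usage: the all-ascending case reproduces the shape of query 1.1.
[$sql, $bindings] = cursorWhere(
    ['foo' => 'asc', 'bar' => 'asc', 'baz' => 'asc'],
    ['foo' => 9, 'bar' => 9, 'baz' => 19000]
);
echo $sql, PHP_EOL;
```

Because each column gets its own operator, a direction such as `bar desc` only flips that column's comparison to `<`, which a single tuple comparison cannot express.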
@paras-malhotra any other thoughts on this? |
Based on the analysis by @halaei and #37216 (comment), it does seem that this PR will improve performance for MySQL, but I'm not sure how it compares for PostgreSQL. Since there are more MySQL users than PostgreSQL users, and SQL Server isn't supported by the current cursor pagination anyway, I think it makes sense to accept this PR. 👍 |
For the record, MySQL has confirmed the performance bug I reported as "a duplicate of an internally-filed bug report, which is not yet scheduled for fixing": |
Do we have any integration tests on this feature that actually hit the database? |
@taylorotwell there are some tests here: framework/tests/Database/DatabaseEloquentIntegrationTest.php Lines 324 to 402 in 855a919
|
@halaei can you rebase with 8.x please? |
cce9fe3 to 8d656a8
Can you mark this as draft until all your checkbox tasks are complete? |
The only remaining checkbox can only be done after this PR is merged. And I don't think it is really required to "remove the redundant parentheses generated by this PR", but let me know if you need me to work on it. |
It is possible to remove the following limitation from cursorPaginate() function:
Example:
A desired SQL statement:
Todo
CursorPaginationException in 9.x.
Remove the redundant parentheses generated by this PR.
Tests
I tried to fix tests by checking the resulting query and bindings every time the mocked get() function is called.