From d93a039377cf3593a2b61ef96cb38e2f19b10921 Mon Sep 17 00:00:00 2001
From: Shay Rojansky
Date: Wed, 19 Jan 2022 12:36:25 +0100
Subject: [PATCH] Warnings for deterministic ordering

In split query, and also a stronger note in the pagination page.

Closes #3242
---
 entity-framework/core/querying/pagination.md           | 6 +++---
 entity-framework/core/querying/single-split-queries.md | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/entity-framework/core/querying/pagination.md b/entity-framework/core/querying/pagination.md
index d4e17effe6..7d325a5049 100644
--- a/entity-framework/core/querying/pagination.md
+++ b/entity-framework/core/querying/pagination.md
@@ -9,6 +9,9 @@ uid: core/querying/pagination
 
 Pagination refers to retrieving results in pages, rather than all at once; this is typically done for large resultsets, where a user interface is shown that allows the user to navigate to the next or previous page of the results.
 
+> [!WARNING]
+> Regardless of the pagination method used, always make sure that your ordering is fully deterministic. For example, if results are ordered only by date, but there can be multiple results with the same date, then results could be skipped when paginating as they're ordered differently across two paginating queries. Ordering by both date and ID (or any other unique property) makes the ordering fully deterministic and avoids this problem. Note that relational databases do not apply any ordering by default, even on the primary key; queries without explicit ordering have non-deterministic resultsets.
+
 ## Offset pagination
 
 A common way to implement pagination with databases is to use the `Skip` and `Take` (`OFFSET` and `LIMIT` in SQL). Given a a page size of 10 results, the third page can be fetched with EF Core as follows:
@@ -30,9 +33,6 @@ Assuming an index is defined on `PostId`, this query is very efficient, and also
 
 Keyset pagination is appropriate for pagination interfaces where the user navigates forwards and backwards, but does not support random access, where the user can jump to any specific page. Random access pagination requires using offset pagination as explained above; because of the shortcomings of offset pagination, carefully consider if random access pagination really is required for your use case, or if next/previous page navigation is enough. If random access pagination is necessary, a robust implementation could use keyset pagination when navigation to the next/previous page, and offset navigation when jumping to any other page.
 
-> [!WARNING]
-> Always make sure that your ordering is fully deterministic. For example, if results are ordered only by date, but there can be multiple results with the same date, then results could be skipped when paginating as they're ordered differently across two queries. Ordering by both date and ID (or any other unique property) makes the resultset deterministic and avoids this problem. Note that relational databases do not apply any ordering by default, even on the primary key; queries without explicit ordering have non-deterministic resultsets.
-
 ### Multiple pagination keys
 
 When using keyset pagination, it's frequently necessary to order by more than one property. For example, the following query paginates by date and ID:
diff --git a/entity-framework/core/querying/single-split-queries.md b/entity-framework/core/querying/single-split-queries.md
index d148bd73da..dfe4c1bf7f 100644
--- a/entity-framework/core/querying/single-split-queries.md
+++ b/entity-framework/core/querying/single-split-queries.md
@@ -42,6 +42,9 @@ INNER JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
 ORDER BY [b].[BlogId]
 ```
 
+> [!WARNING]
+> When using split queries, pay special attention to making your query ordering fully deterministic; not doing so could cause incorrect data to be returned. For example, if results are ordered only by date, but there can be multiple results with the same date, then each one of the split queries could each get different results from the database. Ordering by both date and ID (or any other unique property) makes the ordering fully deterministic and avoids this problem. Note that relational databases do not apply any ordering by default, even on the primary key; queries without explicit ordering have non-deterministic resultsets.
+
 > [!NOTE]
 > One-to-one related entities are always loaded via JOINs in the same query, as it has no performance impact.