Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination is broken when Include is used. #20431

Closed
JakubFojtik opened this issue Mar 27, 2020 · 4 comments
Closed

Pagination is broken when Include is used. #20431

JakubFojtik opened this issue Mar 27, 2020 · 4 comments
Assignees
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@JakubFojtik
Copy link

JakubFojtik commented Mar 27, 2020

A query using an Include and Skip and Take creates SQL with the pagination keywords nested into the first Include JOIN query.
This makes pagination work wrong, skipping e.g. 100 rows when Skip(1) is called (if there are 100 rows that would join with that 1 skipped base row).

dbContext.Set<Workplace>().AsQueryable()
  .Include(x => x.Manager)
  .Skip(10).Take(20)

This gets roughly translated to nested pagination:

SELECT * FROM (
    SELECT * FROM Workplace w
    ORDER BY (SELECT 1)
    OFFSET 10 ROWS FETCH NEXT 20 ROWS ONLY
) AS [t0]
LEFT JOIN Manager m ON (...)

Which returns different results than the expected outer pagination form:

SELECT * FROM (
    SELECT * FROM Workplace w
) AS [t0]
LEFT JOIN Manager m ON (...)
ORDER BY (SELECT 1)
OFFSET 10 ROWS FETCH NEXT 20 ROWS ONLY

The problem is that when executing the EF query subsequently for next pages of data, it can return empty or smaller pages, while the following pages would still have data.
Also since skip is applied too soon, a Skip of 1 can result in skipping all rows it would then join with

I guess one could first get a COUNT and SKIP forever until getting enough rows.
But wouldn't that be vulnerable to a row being deleted by someone else at the same time?

Shouldn't the pagination clauses be on the outside of the query, at least to be intuitive?

The EF Core query uses the default ORDER BY (SELECT 1) discussed in #19870, but that does not seem to cause the problem. Even with a column to ORDER BY the query still returns less results than specified in TAKE, while returning thousands without the paging.

Further technical details

EF Core version: 3.1.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019 16.3

@JakubFojtik JakubFojtik changed the title Query with Include and Skip has Skip nested counter-intuitively Pagination is broken when Include is used. Mar 27, 2020
@ajcvickers
Copy link
Contributor

@maumar to take a look.

@maumar
Copy link
Contributor

maumar commented Mar 30, 2020

@JakubFojtik can you share code listings of your entity classes and DbContext (especially OnModelCreating)?

Based on what you provided so far I reverse engineered the model to something like this:

        public class Workplace
        {
            public int Id { get; set; }
            public int? ManagerId { get; set; }
            public Manager Manager { get; set; }
        }

        public class Manager
        {
            public int Id { get; set; }
            public List<Workplace> Workplaces { get; set; }
        }


        public class MyContext : DbContext
        {
            public DbSet<Workplace> Workplaces { get; set; }
            public DbSet<Manager> Managers { get; set; }


            protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
            {
                optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Repro20431;Trusted_Connection=True;MultipleActiveResultSets=true");
            }
        }

It does produce the sql that you are reporting, but that is expected and correct sql for the query. There can be only one Manager corresponding to any Workplace so actually putting OFFSET before of after the join yields the same results, but if you query the other side of relationship (assuming manager can manage multiple workplaces):

                var query = ctx.Set<Manager>().AsQueryable()
                    .Include(x => x.Workplaces)
                    .Skip(10).Take(20).ToList();

you want to apply OFFSET operation on the manager's table, before joining the workspaces - i.e. the query is supposed to paginate over the managers and include all of their workspaces.

If your intention is (looking at the query I wrote above) to return all the managers but only return their paginated workspaces, then it's a filtered include scenario, which is currently not supported but we just implemented it and it will be available in the next preview - details here: #1833

@JakubFojtik
Copy link
Author

I am sorry, bug was in my code as I should have expected in the first place.
The cause was simply me forgetting to define FK relationships as nullable.

This understandably resulted in INNER joins instead of LEFT joins and the loss of rows observed.
I was wondering how could LEFT join possibly lose rows, and today I found a later commit correcting the relationship nullability.

I was also looking for the paginated relationship feature, thank you for that!

Sorry again for the useless issue report.

@ajcvickers ajcvickers added closed-no-further-action The issue is closed and no further action is planned. and removed area-query labels Apr 5, 2020
@EnmanuelParedesR
Copy link

EnmanuelParedesR commented Aug 19, 2020

Hey, I want to open this issue again. I'm having the same issue. I want to set the pagination after the joins and sub queries.

Code:

 var users = await _enlineaDbContext.Usuario
            .Where(u => !u.Borrado)
            .Include(u => u.Provincia)
            .Include(u => u.Perfil.PerfilRole)
           
            .Include(u => u.UsuarioRole)
              
               .OrderBy(x => x.Activo)
               .Skip(pageSize * (pageNumber - 1))
               .Take(pageSize)
              .ToListAsync().ConfigureAwait(false);

This generates this code [columns removed]:

SELECT *
FROM (
    SELECT *
    FROM [Usuario] AS [u]
    WHERE [u].[Borrado] <> CAST(1 AS bit)
    ORDER BY [u].[Activo]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]

Expected:

SELECT *
FROM (
    SELECT *
    FROM [Usuario] AS [u]
    WHERE [u].[Borrado] <> CAST(1 AS bit)
    ORDER BY [u].[Activo]
   
) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]
 OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY

I Can replicate what I want doing the following:

  List<Usuario> users = await _enlineaDbContext.Usuario
             .Where(u => !u.Borrado)
             .Where(filtro)//Properties custom
             .Include(u => u.Provincia)
             .Include(u => u.Perfil.PerfilRole)
                 .ThenInclude(pr => pr.Role)
             .Include(u => u.UsuarioRole)
                 .ThenInclude(ur => ur.Role)
             .OrderBy(u => u.Activo)
             .ToListAsync().ConfigureAwait(false);

            users = users.Skip(pageSize * (pageNumber - 1))
            .Take(pageSize).ToList();

But here we already made the fetch, and I don't want items than I'm not going to use.

Just want to use Take() and Skip() in the outside query.


Technical details
EF Core version: 3.1.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

4 participants