Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Find/Single operations with includes #22579

Open
joakimriedel opened this issue Sep 17, 2020 · 17 comments
Open

Optimize Find/Single operations with includes #22579

joakimriedel opened this issue Sep 17, 2020 · 17 comments

Comments

@joakimriedel
Copy link
Contributor

In 5.0 RC1 a warning log message is thrown for .SingleAsync() queries:

warn: Microsoft.EntityFrameworkCore.Query[10102]
      Query uses a row limiting operation (Skip/Take) without OrderBy, which may lead to unpredictable results.

I guess this has something to do with the fact that the actual SQL generated includes an automatic TOP(2) clause to determine if there are more than 1 item matching the query.

In 3.1 there were no warnings logged. I prefer no warnings logged.

@smitpatel
Copy link
Contributor

smitpatel commented Sep 17, 2020

The warning is logged because doing Single on a query without using Where and OrderBy, which means that the query is going to get 1 record from server which can change in future iteration of query. This warning existed in 2.x release and was absent in 3.1. We have just reintroduced what was missing. To avoid warning, add ~Where/~OrderBy to the query before calling Single.

@joakimriedel
Copy link
Contributor Author

This particular query just picks the current user by primary key, hence no Where/OrderBy.

.SingleAsync(u => u.Id == currentUserId)

Meaning I want the single user by Id, but I want exceptions thrown if:

  1. the user is missing
    or
  2. for some reason the primary key is no longer unique and two users exist with same primary key in the database

I don't see why I should add Where/OrderBy in this case?

@smitpatel
Copy link
Contributor

Error on my part that it should not logged with Where. It does get logged in the absence of OrderBy when translating Skip or Take.
SingleAsync does not use Skip/Take pipeline directly, can you share full query?

@joakimriedel
Copy link
Contributor Author

Sure

// get a hold on the current user with metadata
                _user = await _dbContext.Users
                        .TagWith("CurrentRequest - GetUserAsync")
                        .Include(u => u.Currency)
                        .Include(u => u.Verifications)
                        .Include(u => u.NotificationSettings)
                        .Include(u => u.UserConnections)
                            .ThenInclude(uc => uc.BillingGroup)
                            .ThenInclude(uc => uc.Verifications)
                        .AsSplitQuery()
                        .SingleAsync(u => u.Id == currentUserId);

@smitpatel
Copy link
Contributor

smitpatel commented Sep 17, 2020

Additional queries for Include has to do Take(1) to only get correlated data for first result they don't have OrderBy so result could be non-deterministic. For Single we may be able to identify that the predicate is on PK so there would be only one match, it is hard to generalize it for any Where/Take combination. Queries can have multiple rows with same PK when doing joins and other complex scenarios. This was probably happening in 2.1 also.

@joakimriedel
Copy link
Contributor Author

Still trying to wrap my head around this. From a user perspective, something feels wrong since I expect to get the single user with all included navigation properties without warnings or having to add OrderBy.

I also don't really understand why the Include have to do Take(1) ? I thought it would just match by the predicate given, joining any related items.

But technical reasons aside, what is the recommended way to retrieve a single tracked item from the database by primary key having additional navigation properties eagerly loaded?

@smitpatel
Copy link
Contributor

I also don't really understand why the Include have to do Take(1) ? I thought it would just match by the predicate given, joining any related items.

It does predicate match but it needs to figure out which items from parent it needs to match in the predicate joining to Children. If only 1 (or limited number of parents) are selected then we have to restrict the set of parents to same amount in additional queries so that predicate only matches related data for that limited set rather than generating join for everything.
You can suppress the warning or add OrderBy before First.

We will discuss in the meeting "recommended way".

@joakimriedel
Copy link
Contributor Author

Thanks. I'll try to suppress it for now, don't want any possible overhead of OrderBy since this is on a critical application path. How do I suppress a single warning? I do want warnings for other Single-query abuses, just not this particular one.

@smitpatel
Copy link
Contributor

Actually there will be OrderBy always because of Include so that we get records in a deterministic order to create proper buckets. It can be optimized for Single case but currently that is not happening.

@joakimriedel
Copy link
Contributor Author

Did a bit of debugging this morning and it adds an extra ORDER BY in the inner clause. But the sql query planner didn't seem to add much overhead from this, perhaps it's smart enough to optimize out the ORDER BY since the WHERE clause is on primary key?

query.OrderBy(u => u.Id).SingleAsync(u => u.Id == currentUserId)

(no warning)

SELECT [t].[Id], [t].[...]
FROM (
    SELECT TOP(2) [a].[Id], [a].[...], [c].[Id] AS [Id0], [c].[...]
    FROM [AspNetUsers] AS [a]
    INNER JOIN [Currencies] AS [c] ON [a].[CurrencyId] = [c].[Id]
    WHERE [a].[Id] = @__currentUserId_0
    ORDER BY [a].[Id]
) AS [t]
ORDER BY [t].[Id], [t].[Id0]

query.SingleAsync(u => u.Id == currentUserId)

(warning)

SELECT [t].[Id], [t].[...]
FROM (
    SELECT TOP(2) [a].[Id], [a].[...], [c].[Id] AS [Id0], [c].[...]
    FROM [AspNetUsers] AS [a]
    INNER JOIN [Currencies] AS [c] ON [a].[CurrencyId] = [c].[Id]
    WHERE [a].[Id] = @__currentUserId_0
) AS [t]
ORDER BY [t].[Id], [t].[Id0]

I'll keep the OrderBy for now until your meeting to see if there's a better way. At least I do not get any runtime warnings now anymore.

@ajcvickers
Copy link
Member

Moving to the backlog to attempt to not generate this warning when it is not needed.

@ajcvickers ajcvickers added this to the Backlog milestone Sep 18, 2020
@smitpatel smitpatel changed the title 5.0 RC1 .SingleAsync() throws warning about row limiting operation Optimize Find/Single operations with includes Sep 18, 2020
@joakimriedel
Copy link
Contributor Author

For your info - I also did some basic StopWatch-benchmarking of the SingleAsync query with Include compared to FindAsync followed by LoadAsync, but the SingleAsync still performed slightly better (~38 ms compared to ~43 ms) even with the ORDER BY.

An optimized FindAsync with Include would be handy for querying out single items and related data, looking forward to that!

@mqudsi
Copy link

mqudsi commented Feb 24, 2021

I just want to add that aside from having a warning where it's preferable to see none, this also makes it harder/near impossible to track down actual cases where non-deterministic results are possible because of, e.g., .FirstAsync() when the logs are full of the false positives.

@joakimriedel
Copy link
Contributor Author

@ajcvickers friendly bump - would this be considered a candidate to move from backlog anytime soon? as you see in previous comment from 2021, there's at least 22 other souls looking for this.

@ajcvickers
Copy link
Member

@joakimriedel This issue is currently at position 131 in the list of most voted issues. Not sure why people are doing 👍 on the comment without voting for the issue, but even if it had 23 votes, it would still only be at around position 62.

@joakimriedel
Copy link
Contributor Author

Ok thanks, did not know it was such a strict popularity vote to get out of backlog. I'll see what I can do to promote it 😊

@ajcvickers
Copy link
Member

@joakimriedel It's not a strict popularity vote at all--see the planning process. But cases made on popularity need to be pretty high up the list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants