You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I know that since EF Core 3.0 a query with multiple included collections (via .Include(x => x.my_collection)) is fetched via a single SQL query which can result in a huge result set containing duplicate information due to many join operations (i.e. cartesian explosion).
I know that since EF Core 5.0 we can use .AsSplitQuery() or even activate split queries globally. However as I understand it this poses potential risks and should only be used when necessary.
In my concrete case I want to retrieve a single object out of the database with .SingleOrDefault(x => x.id == ...). The object has many included collections, some are included via a .ThenInclude(...).
The C# looks like this (names changed):
For one rather bad case I get a result set back with ~40k rows and ~100 columns because of all the joins. The SQL query takes ~15 seconds.
The interesting thing is that upon benchmarking this query in pgAdmin I found that >95% of those 15 seconds seems to be used on the last line of the query, which is an ORDER BY.
The last line looks like this:
So my question is why is an ORDER BY in this query when my C# code does not call for it? (The C# code contains no ordering operation and also just wants a single element by id).
I can only guess this is for EF Core to easier translate the result of the query to CLR objects but when this helper blows up SQL query time ~twentyfold I doubt its worth it in this case.
Can I somehow prevent the generation of the ORDER BY? If not I will have to resort to split queries.
EF Core version: 5.0.5
Database provider: Npgsql 5.0.5.1
Database: PostgreSQL 12
Target framework: NET 5.0
Operating system: Windows 10
If this question belongs in the npgsql repository just give me a notice and I will create it there.
If something more is requested I will try to add it.
The text was updated successfully, but these errors were encountered:
desdoades
changed the title
EF Core creates SQL query with ORDER BY at the end massively impacting performance for cases of cartesian explosion
EF Core creates SQL query with ORDER BY at the end massively impacting performance for cases with many included collections
Dec 16, 2021
@desdoades the orderings are necessary for EF Core to properly load related entities (#19571), see also #18022. #19828 removed the last ordering for 6.0, but the others are necessary. You can indeed use split query for this; given the number of includes in your query above, that looks like a good idea even regardless of the ordering.
Thank you for the quick answer and excuse me for creating a duplicate. I used search on the issues but did not find it.
I will read your links and then use query splitting.
I know that since EF Core 3.0 a query with multiple included collections (via .Include(x => x.my_collection)) is fetched via a single SQL query which can result in a huge result set containing duplicate information due to many join operations (i.e. cartesian explosion).
I know that since EF Core 5.0 we can use .AsSplitQuery() or even activate split queries globally. However as I understand it this poses potential risks and should only be used when necessary.
In my concrete case I want to retrieve a single object out of the database with .SingleOrDefault(x => x.id == ...). The object has many included collections, some are included via a .ThenInclude(...).
The C# looks like this (names changed):
For one rather bad case I get a result set back with ~40k rows and ~100 columns because of all the joins. The SQL query takes ~15 seconds.
The interesting thing is that upon benchmarking this query in pgAdmin I found that >95% of those 15 seconds seems to be used on the last line of the query, which is an ORDER BY.
The last line looks like this:
So my question is why is an ORDER BY in this query when my C# code does not call for it? (The C# code contains no ordering operation and also just wants a single element by id).
I can only guess this is for EF Core to easier translate the result of the query to CLR objects but when this helper blows up SQL query time ~twentyfold I doubt its worth it in this case.
Can I somehow prevent the generation of the ORDER BY? If not I will have to resort to split queries.
EF Core version: 5.0.5
Database provider: Npgsql 5.0.5.1
Database: PostgreSQL 12
Target framework: NET 5.0
Operating system: Windows 10
If this question belongs in the npgsql repository just give me a notice and I will create it there.
If something more is requested I will try to add it.
The text was updated successfully, but these errors were encountered: