Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EF Core 7 GroupBy with Owned Entity perf Regression #29593

Open
Tracked by #30173
martosource opened this issue Nov 17, 2022 · 13 comments
Open
Tracked by #30173

EF Core 7 GroupBy with Owned Entity perf Regression #29593

martosource opened this issue Nov 17, 2022 · 13 comments
Assignees
Labels
area-groupby area-perf area-query consider-for-current-release customer-reported punted-for-8.0 Originally planned for the EF Core 8.0 (EF8) release, but moved out due to resource constraints. type-bug
Milestone

Comments

@martosource
Copy link

EF Core 7 for SQL Server has a SQL query translation regression from EF Core 6 for GroupBy with aggregate functions.

For the query:

var query = context.Items
    .GroupBy(x => x.Group)
    .Select(x => new
    {
        Set1Value1 = x.Sum(v => v.Set1.Value1),
        Set1Value2 = x.Sum(v => v.Set1.Value2),
        Set2Value1 = x.Sum(v => v.Set2.Value1),
        Set2Value2 = x.Sum(v => v.Set2.Value2),
    });

SQL generated from EF Core 6:

SELECT COALESCE(SUM([i].[Set1_Value1]), 0) AS [Set1Value1], COALESCE(SUM([i].[Set1_Value2]), 0) AS [Set1Value2], COALESCE(SUM([i].[Set2_Value1]), 0) AS [Set2Value1], COALESCE(SUM([i].[Set2_Value2]), 0) AS [Set2Value2]
FROM [Items] AS [i]
GROUP BY [i].[Group]

SQL generated from EF Core 7.0.0

SELECT (
    SELECT COALESCE(SUM([i0].[Set1_Value1]), 0)
    FROM [Items] AS [i0]
    WHERE [i].[Group] = [i0].[Group] OR (([i].[Group] IS NULL) AND ([i0].[Group] IS NULL))) AS [Set1Value1], (
    SELECT COALESCE(SUM([i1].[Set1_Value2]), 0)
    FROM [Items] AS [i1]
    WHERE [i].[Group] = [i1].[Group] OR (([i].[Group] IS NULL) AND ([i1].[Group] IS NULL))) AS [Set1Value2], (
    SELECT COALESCE(SUM([i2].[Set2_Value1]), 0)
    FROM [Items] AS [i2]
    WHERE [i].[Group] = [i2].[Group] OR (([i].[Group] IS NULL) AND ([i2].[Group] IS NULL))) AS [Set2Value1], (
    SELECT COALESCE(SUM([i3].[Set2_Value2]), 0)
    FROM [Items] AS [i3]
    WHERE [i].[Group] = [i3].[Group] OR (([i].[Group] IS NULL) AND ([i3].[Group] IS NULL))) AS [Set2Value2]
FROM [Items] AS [i]
GROUP BY [i].[Group]

EF Core 7 is now adding sub queries to the SELECT for each of the SUM functions. I believe that the EF Core 6 SQL is better and more efficient.

I have created a repo https://github.com/martosource/EfCoreGroupByOwnedTest

@raffaele-cappelli
Copy link

I agree, I just noted the same problem. Many of my queries are 10x slower !

@maumar
Copy link
Contributor

maumar commented Nov 17, 2022

per conversation with Smit - this is the commit that introduced the change - 21bbb3f, specifically the removal of GroupByAggregateLiftingExpressionVisitor. With the design changes and new scenarios that were enabled, aggregate lifting became too complicated to do reliably

@martosource
Copy link
Author

@maumar does this means it won't or can't be fixed?

Using sub queries in the SELECT is a real performance killer.

@maumar maumar changed the title EF Core 7 GroupBy with Owned Entity Regression EF Core 7 GroupBy with Owned Entity perf Regression Nov 17, 2022
@maumar
Copy link
Contributor

maumar commented Nov 17, 2022

@martosource realistically the chances of fixing this (soon) are low. The proper fix is very hard and the person who had the most knowledge about the area no longer works in the team, which makes it even harder/riskier. I found a hacky workaround - test it out to see if it's acceptable for your case:

context.Items
           .Select(x => new {
               x.Group,
               x.Set1,
               x.Set2,
               Dummy = Guid.NewGuid()
           })
           .Distinct()
           .GroupBy(x => x.Group)
           .Select(x => new
           {
               Set1Value1 = x.Sum(v => v.Set1.Value1),
               Set1Value2 = x.Sum(v => v.Set1.Value2),
               Set2Value1 = x.Sum(v => v.Set2.Value1),
               Set2Value2 = x.Sum(v => v.Set2.Value2),
           })

produces:

SELECT COALESCE(SUM([t].[Set1_Value1]), 0) AS [Set1Value1], COALESCE(SUM([t].[Set1_Value2]), 0) AS [Set1Value2], COALESCE(SUM([t].[Set2_Value1]), 0) AS [Set2Value1], COALESCE(SUM([t].[Set2_Value2]), 0) AS [Set2Value2]
FROM (
    SELECT DISTINCT [i].[Group], [i].[Id], [i].[Set1_Value1], [i].[Set1_Value2], [i].[Set2_Value1], [i].[Set2_Value2], NEWID() AS [Dummy]
    FROM [Items] AS [i]
) AS [t]
GROUP BY [t].[Group]

The idea is to expand and project all the needed navigations before you do group by. EF is smart enough to counter act this, but for that purpose the workaround introduces Distinct - it causes a pushdown and "locks" the shape of projection, so EF engine can't do anything but accept the shape. Distinct can change the results (if there are any duplicates), so to counteract that, we introduce a dummy guid, that always guarantees the uniqueness. You can drop the dummy guid if you are sure the results are unique. But you also need to assess if the distinct slowdown is bad or not.

edit: actually, projecting the owner key instead of generating guid should be better/faster in almost every case

@martosource
Copy link
Author

@maumar thank you, I will give this a try.

@raffaele-cappelli
Copy link

@martosource realistically the chances of fixing this (soon) are low. The proper fix is very hard and the person who had the most knowledge about the area no longer works in the team, which makes it even harder/riskier. I found a hacky workaround - test is out to see if it's acceptable for your case:

Thank you @maumar for your quick reply and workaround. In my case I have too many queries like this to apply the workaround, and I can confirm the performance loss is large (about 10x slower in my application on SqlServer). I'll have to stick with EF6 until this bug is solved.

@superjulius
Copy link

Thank you @maumar for the workaround! It definitely sounds hacky but it does work with apparently the only GroupBy query impacted for us. Others are not impacted.

We don't want to revert to EF6, nor completely change our approach and since there is no ETA on a possible fix so we will go for it.

@danports
Copy link

danports commented Jan 6, 2023

I'm seeing what is probably the same issue on EF7 with GroupBy expressions that contain casts on navigation properties, e.g. GroupBy(x => (x.Pet as Dog).Name). In my case, the performance regression is so severe that query execution times out after 2 minutes, whereas on EF6 it completed in a fraction of a second. I'll try to adapt @maumar's workaround.

@danports
Copy link

danports commented Jan 6, 2023

The Select + Distinct workaround worked for me, and it doesn't appear that the DISTINCT added anything significant to the query execution time. Thanks @maumar!

@greengumby
Copy link

This is quite a serious regression and I find it surprising that so little attention has been paid to this. I will be reverting to EF6 until this issue is closed.

@martosource
Copy link
Author

@greengumby the SQL generated by EF Core 6 imho is better than EF Core 7.

@raffaele-cappelli
Copy link

@greengumby the SQL generated by EF Core 6 imho is better than EF Core 7.

As far as I'm concerned, in many of my queries EF7 performs better than EF6; I only had the problem described in this issue, but I have to say that once I understood this limitation, I took the opportunity to restructure the queries involved and they are now both more efficient and more readable. I'm not saying there isn't the problem, but at the end of the day, it wasn't that bad.

@ajcvickers ajcvickers added the punted-for-8.0 Originally planned for the EF Core 8.0 (EF8) release, but moved out due to resource constraints. label Jun 28, 2023
@ajcvickers ajcvickers modified the milestones: 8.0.0, Backlog Jun 29, 2023
@p3t3rix
Copy link

p3t3rix commented Apr 16, 2024

Any updates regarding this regression ? It seems to also exist in ef8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-groupby area-perf area-query consider-for-current-release customer-reported punted-for-8.0 Originally planned for the EF Core 8.0 (EF8) release, but moved out due to resource constraints. type-bug
Projects
None yet
Development

No branches or pull requests

10 participants