Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid duplicated subquery when translating LINQ intermediate projections #27285

Closed
roji opened this issue Jan 26, 2022 · 11 comments
Closed

Avoid duplicated subquery when translating LINQ intermediate projections #27285

roji opened this issue Jan 26, 2022 · 11 comments

Comments

@roji
Copy link
Member

roji commented Jan 26, 2022

Given the following query:

var q1 = ctx
    .Blogs
    .Select(p => new
    {
        p.Name,
        PostCount = p.Posts.Count
    })
    .Where(p => p.PostCount > 5)
    .ToList();

... we currently generate the following SQL:

SELECT [b].[Name], (
    SELECT COUNT(*)
    FROM [Post] AS [p0]
    WHERE [b].[Id] = [p0].[BlogId]) AS [PostCount]
FROM [Blogs] AS [b]
WHERE (
    SELECT COUNT(*)
    FROM [Post] AS [p]
    WHERE [b].[Id] = [p].[BlogId]) > 5;

Note that the LINQ Select gets duplicated twice as a subquery in the SQL. Rather than duplicating, the LINQ intermediate projection can be applied as a subquery:

SELECT [b].[Name], [b].[PostCount]
FROM (
    SELECT [b].[Name], (
        SELECT COUNT(*)
        FROM [Post] AS [p0]
        WHERE [b].[Id] = [p0].[BlogId]) AS [PostCount]
    FROM [Blogs] AS [b]
) AS [b]
WHERE [Postcount] > 1;

... allowing SQL Server to evaluate the subquery just once. The SQL Server plan TotalSubtreeCost for the former query is 1.63, whereas the WITH-based alternative is 1.00, which is very significant.

Full plan details

Current query with duplicated subquery

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+--------------------------------+--------+--------+--------+------------------+
|StmtText                                                                                                                                                                                                                         |StmtId|NodeId|Parent|PhysicalOp          |LogicalOp           |Argument                                                                                                                        |DefinedValues                                                       |EstimateRows|EstimateIO|EstimateCPU |AvgRowSize|TotalSubtreeCost|OutputList                      |Warnings|Type    |Parallel|EstimateExecutions|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+--------------------------------+--------+--------+--------+------------------+
|SELECT [b].[Name], (                                                                                                                                                                                                             |1     |1     |0     |NULL                |NULL                |1                                                                                                                               |NULL                                                                |31906.955   |NULL      |NULL        |NULL      |1.6382366       |NULL                            |NULL    |SELECT  |false   |NULL              |
|SELECT COUNT(*)                                                                                                                                                                                                                  |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|FROM [Post] AS [p0]                                                                                                                                                                                                              |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|WHERE [b].[Id] = [p0].[BlogId]) AS [PostCount]                                                                                                                                                                                   |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|FROM [Blogs] AS [b]                                                                                                                                                                                                              |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|WHERE (                                                                                                                                                                                                                          |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|SELECT COUNT(*)                                                                                                                                                                                                                  |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|FROM [Post] AS [p]                                                                                                                                                                                                               |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|WHERE [b].[Id] = [p].[BlogId]) > 1                                                                                                                                                                                               |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                                |        |        |        |                  |
|  |--Compute Scalar(DEFINE:([Expr1007]=CASE WHEN [Expr1005] IS NULL THEN (0) ELSE [Expr1005] END))                                                                                                                               |1     |2     |1     |Compute Scalar      |Compute Scalar      |DEFINE:([Expr1007]=CASE WHEN [Expr1005] IS NULL THEN (0) ELSE [Expr1005] END)                                                   |[Expr1007]=CASE WHEN [Expr1005] IS NULL THEN (0) ELSE [Expr1005] END|31906.955   |0         |0.0031906955|4039      |1.6382366       |[b].[Name], [Expr1007]          |NULL    |PLAN_ROW|false   |1                 |
|       |--Merge Join(Right Outer Join, MERGE:([p0].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p0].[BlogId]))                                                         |1     |3     |2     |Merge Join          |Right Outer Join    |MERGE:([p0].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p0].[BlogId])|NULL                                                                |31906.955   |0         |0.1807932   |4039      |1.6350459       |[b].[Name], [Expr1005]          |NULL    |PLAN_ROW|false   |1                 |
|            |--Compute Scalar(DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1012],0)))                                                                                                                                            |1     |4     |3     |Compute Scalar      |Compute Scalar      |DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1012],0))                                                                          |[Expr1005]=CONVERT_IMPLICIT(int,[Expr1012],0)                       |49999       |0         |0           |15        |0.4545744       |[p0].[BlogId], [Expr1005]       |NULL    |PLAN_ROW|false   |1                 |
|            |    |--Stream Aggregate(GROUP BY:([p0].[BlogId]) DEFINE:([Expr1012]=Count(*)))                                                                                                                                      |1     |5     |4     |Stream Aggregate    |Aggregate           |GROUP BY:([p0].[BlogId])                                                                                                        |[Expr1012]=Count(*)                                                 |49999       |0         |0.0849983   |15        |0.4545744       |[p0].[BlogId], [Expr1012]       |NULL    |PLAN_ROW|false   |1                 |
|            |         |--Index Scan(OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p0]), ORDERED FORWARD)                                                                                                                      |1     |6     |5     |Index Scan          |Index Scan          |OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p0]), ORDERED FORWARD                                                          |[p0].[BlogId]                                                       |99998       |0.2594213 |0.1101548   |11        |0.3695761       |[p0].[BlogId]                   |NULL    |PLAN_ROW|false   |1                 |
|            |--Filter(WHERE:(CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END>(1)))                                                                                                                                     |1     |7     |3     |Filter              |Filter              |WHERE:(CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END>(1))                                                           |NULL                                                                |31906.955   |0         |0.04399912  |4039      |0.99967533      |[b].[Id], [b].[Name]            |NULL    |PLAN_ROW|false   |1                 |
|                 |--Merge Join(Right Outer Join, MERGE:([p].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p].[BlogId]))                                                 |1     |8     |7     |Merge Join          |Right Outer Join    |MERGE:([p].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p].[BlogId])  |NULL                                                                |49999       |0         |0.2205957   |4043      |0.9556762       |[b].[Id], [b].[Name], [Expr1002]|NULL    |PLAN_ROW|false   |1                 |
|                      |--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1013],0)))                                                                                                                                  |1     |9     |8     |Compute Scalar      |Compute Scalar      |DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1013],0))                                                                          |[Expr1002]=CONVERT_IMPLICIT(int,[Expr1013],0)                       |49999       |0         |0           |15        |0.4545744       |[p].[BlogId], [Expr1002]        |NULL    |PLAN_ROW|false   |1                 |
|                      |    |--Stream Aggregate(GROUP BY:([p].[BlogId]) DEFINE:([Expr1013]=Count(*)))                                                                                                                             |1     |10    |9     |Stream Aggregate    |Aggregate           |GROUP BY:([p].[BlogId])                                                                                                         |[Expr1013]=Count(*)                                                 |49999       |0         |0.0849983   |15        |0.4545744       |[p].[BlogId], [Expr1013]        |NULL    |PLAN_ROW|false   |1                 |
|                      |         |--Index Scan(OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p]), ORDERED FORWARD)                                                                                                             |1     |11    |10    |Index Scan          |Index Scan          |OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p]), ORDERED FORWARD                                                           |[p].[BlogId]                                                        |99998       |0.2594213 |0.1101548   |11        |0.3695761       |[p].[BlogId]                    |NULL    |PLAN_ROW|false   |1                 |
|                      |--Clustered Index Scan(OBJECT:([test].[dbo].[Blogs].[PK_Blogs] AS [b]), ORDERED FORWARD)                                                                                                                  |1     |12    |8     |Clustered Index Scan|Clustered Index Scan|OBJECT:([test].[dbo].[Blogs].[PK_Blogs] AS [b]), ORDERED FORWARD                                                                |[b].[Id], [b].[Name]                                                |49999       |0.22534722|0.0551559   |4039      |0.28050312      |[b].[Id], [b].[Name]            |NULL    |PLAN_ROW|false   |1                 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+--------------------------------+--------+--------+--------+------------------+

Proposed single subquery variant

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+-------------------------+--------+--------+--------+------------------+
|StmtText                                                                                                                                                                                                                            |StmtId|NodeId|Parent|PhysicalOp          |LogicalOp           |Argument                                                                                                                        |DefinedValues                                                       |EstimateRows|EstimateIO|EstimateCPU |AvgRowSize|TotalSubtreeCost|OutputList               |Warnings|Type    |Parallel|EstimateExecutions|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+-------------------------+--------+--------+--------+------------------+
|SELECT [b].[Name], [b].[PostCount]                                                                                                                                                                                                  |1     |1     |0     |NULL                |NULL                |1                                                                                                                               |NULL                                                                |31906.955   |NULL      |NULL        |NULL      |1.002866        |NULL                     |NULL    |SELECT  |false   |NULL              |
|FROM (                                                                                                                                                                                                                              |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|SELECT [b].[Name], (                                                                                                                                                                                                                |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|SELECT COUNT(*)                                                                                                                                                                                                                     |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|FROM [Post] AS [p0]                                                                                                                                                                                                                 |      |      |      |                    |                    |                              |--Filter(WHERE:(CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END>(1)))                                                                                                         |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|WHERE [b].[Id] = [p0].[BlogId]) AS [PostCount]                                                                                                                                                                                      |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|FROM [Blogs] AS [b]                                                                                                                                                                                                                 |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|) AS [b]                                                                                                                                                                                                                            |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|WHERE [Postcount] > 1                                                                                                                                                                                                               |      |      |      |                    |                    |                                                                                                                                |                                                                    |            |          |            |          |                |                         |        |        |        |                  |
|  |--Compute Scalar(DEFINE:([Expr1004]=CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END))                                                                                                                                  |1     |2     |1     |Compute Scalar      |Compute Scalar      |DEFINE:([Expr1004]=CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END)                                                   |[Expr1004]=CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END|31906.955   |0         |0.0031906955|4039      |1.002866        |[b].[Name], [Expr1004]   |NULL    |PLAN_ROW|false   |1                 |
|       |--Filter(WHERE:(CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END>(1)))                                                                                                                                             |1     |3     |2     |Filter              |Filter              |WHERE:(CASE WHEN [Expr1002] IS NULL THEN (0) ELSE [Expr1002] END>(1))                                                           |NULL                                                                |31906.955   |0         |0.04399912  |4039      |0.99967533      |[b].[Name], [Expr1002]   |NULL    |PLAN_ROW|false   |1                 |
|            |--Merge Join(Right Outer Join, MERGE:([p0].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p0].[BlogId]))                                                       |1     |4     |3     |Merge Join          |Right Outer Join    |MERGE:([p0].[BlogId])=([b].[Id]), RESIDUAL:([test].[dbo].[Blogs].[Id] as [b].[Id]=[test].[dbo].[Post].[BlogId] as [p0].[BlogId])|NULL                                                                |49999       |0         |0.2205957   |4039      |0.9556762       |[b].[Name], [Expr1002]   |NULL    |PLAN_ROW|false   |1                 |
|                 |--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1007],0)))                                                                                                                                          |1     |5     |4     |Compute Scalar      |Compute Scalar      |DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1007],0))                                                                          |[Expr1002]=CONVERT_IMPLICIT(int,[Expr1007],0)                       |49999       |0         |0           |15        |0.4545744       |[p0].[BlogId], [Expr1002]|NULL    |PLAN_ROW|false   |1                 |
|                 |    |--Stream Aggregate(GROUP BY:([p0].[BlogId]) DEFINE:([Expr1007]=Count(*)))                                                                                                                                    |1     |6     |5     |Stream Aggregate    |Aggregate           |GROUP BY:([p0].[BlogId])                                                                                                        |[Expr1007]=Count(*)                                                 |49999       |0         |0.0849983   |15        |0.4545744       |[p0].[BlogId], [Expr1007]|NULL    |PLAN_ROW|false   |1                 |
|                 |         |--Index Scan(OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p0]), ORDERED FORWARD)                                                                                                                    |1     |7     |6     |Index Scan          |Index Scan          |OBJECT:([test].[dbo].[Post].[IX_Post_BlogId] AS [p0]), ORDERED FORWARD                                                          |[p0].[BlogId]                                                       |99998       |0.2594213 |0.1101548   |11        |0.3695761       |[p0].[BlogId]            |NULL    |PLAN_ROW|false   |1                 |
|                 |--Clustered Index Scan(OBJECT:([test].[dbo].[Blogs].[PK_Blogs] AS [b]), ORDERED FORWARD)                                                                                                                          |1     |8     |4     |Clustered Index Scan|Clustered Index Scan|OBJECT:([test].[dbo].[Blogs].[PK_Blogs] AS [b]), ORDERED FORWARD                                                                |[b].[Id], [b].[Name]                                                |49999       |0.22534722|0.0551559   |4039      |0.28050312      |[b].[Id], [b].[Name]     |NULL    |PLAN_ROW|false   |1                 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+--------------------+--------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------+------------+----------+------------+----------+----------------+-------------------------+--------+--------+--------+------------------+

/cc @smitpatel maybe good to keep this in mind for your upcoming pending select work.

Note: exposing WITH as a user-facing API is tracked by #26486; this is different (and seems more important).

@roji roji changed the title Use WITH to optimize intermediate projections Avoid duplicated subquery when translating LINQ intermediate projections Jan 26, 2022
@roji
Copy link
Member Author

roji commented Jan 26, 2022

Note that if the same subquery ends up getting referenced twice in SQL (e.g. multiple joins to the same subquery), a WITH expression can be used to extract that out and avoid duplication.

@smitpatel
Copy link
Contributor

Duplicate of #7776

@smitpatel smitpatel marked this as a duplicate of #7776 Jan 26, 2022
@roji
Copy link
Member Author

roji commented Jan 26, 2022

@smitpatel does this get unblocked (or even implemented) as part of the pending selector work you have planned?

@smitpatel
Copy link
Contributor

It will automatically work after pending selector work. I plan to add tests for various scenarios from linked issues on #7776

@roji
Copy link
Member Author

roji commented Jan 26, 2022

That's great news, thanks!

@InspiringCode
Copy link

I would have needed this a few times in my projects. @smitpatel Is there any estimation when this might be implemented?

As a workaround one could use the AsSubquery() extension method of the Thinktecture NuGet package, but I would really prefer to have this common case handled by EF Core itself.

@roji
Copy link
Member Author

roji commented Feb 5, 2024

@InspiringCode we're looking into improving this, but it's quite a bit architectural change that can't be done lightly.

Re AsSubquery(), that's generally not a direction we want to go in - LINQ is about expressing your queries via .NET code, and not via SQL concepts (like "subquery"). At the end of the day is this an internal problem in how EF translates LINQ queries, so it should be fixed inside EF rather than forcing the user to work around it via additional operators.

@InspiringCode
Copy link

@roji Is there any workaround to achieve this behavior at the time? AsSubQuery is a possible workaround for SQL Server but I am currently working on a PostgreSQL project and the Thinktecture library only support SQL Server 😞. Is there any simple way to implement something like AsSubQuery for Npgsql as an extension method or the like? I have quite a few sub-select properties on which a also have to filter on and running them twice (as in the currently generated SQL) is horribly inefficient.

@roji
Copy link
Member Author

roji commented Jun 4, 2024

@InspiringCode whatever AsSubquery is doing should be possible for the PG provider as well - both are EF relational providers. Beyond that, I'm not aware of a workaround at this point.

@InspiringCode
Copy link

@roji It implements some kind of visitor and calls PushdownToSubquery (I don't remember the exact name) if it encounters a CallExpression with the AsSubQuery method (which is inserted in the expression tree by the AsSubQuery method). But requires quite some classes and infrastructure and is not a trivial thing.

Is there any easier way to plug into the expression tree processing or call PushdownToSubquery in another way?

@roji
Copy link
Member Author

roji commented Jun 4, 2024

@InspiringCode unfortunately not, there's no magic solution here. But I do think what AsSubquery is doing should pretty much work as-is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants