Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StackOverflowException in nested GroupBy query #27094

Closed
bachratyg opened this issue Jan 3, 2022 · 6 comments
Closed

StackOverflowException in nested GroupBy query #27094

bachratyg opened this issue Jan 3, 2022 · 6 comments
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Milestone

Comments

@bachratyg
Copy link

bachratyg commented Jan 3, 2022

A StackOverflow occurs when a nested group by references an expression from an outer group by. This is a regression, it used to work in EF Core 5.

Repro

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="6.0.1" />
  </ItemGroup>
</Project>
class Program
{
	public class Table
	{
		public int Id { get; set; }
		public int? Value { get; set; }
	}
	public class TestDb : DbContext
	{
		public DbSet<Table> Table { get; set; }
		protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
		{
			optionsBuilder.UseSqlServer("data source=dummy");
		}
	}
	static void Main()
	{
		using var db = new TestDb();
		var q = from t in db.Table
			group t.Id by t.Value into tg
			select new
			{
				A = tg.Key,
				B = tg.Sum(),
				C = (from t in db.Table
					 group t.Id by t.Value into tg2
					 select tg.Sum() + tg2.Sum()
					 ).FirstOrDefault()
			};
		var sql = q.ToQueryString();
	}
}

Expected translation (approx):

SELECT
    t1.Value AS A,
    SUM(t1.Id) AS B,
    (
        SELECT SUM(t1.Id) + SUM(t2.Id)
        FROM Table t2
        GROUP BY t2.Value
    ) AS C
FROM Table t1
GROUP BY t1.Value

Actual result:

Stack overflow.
Repeat 16011 times:
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression+GroupByAggregateLiftingExpressionVisitor.Visit(System.Linq.Expressions.Expression)
--------------------------------
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.ScalarSubqueryExpression.VisitChildren(System.Linq.Expressions.ExpressionVisitor)
   at System.Linq.Expressions.ExpressionVisitor.VisitExtension(System.Linq.Expressions.Expression)
   at System.Linq.Expressions.Expression.Accept(System.Linq.Expressions.ExpressionVisitor)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SqlFunctionExpression.VisitChildren(System.Linq.Expressions.ExpressionVisitor)
   at System.Linq.Expressions.ExpressionVisitor.VisitExtension(System.Linq.Expressions.Expression)
   at System.Linq.Expressions.Expression.Accept(System.Linq.Expressions.ExpressionVisitor)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.TryLiftGroupByAggregate(Microsoft.EntityFrameworkCore.Query.SqlExpressions.SqlExpression)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.AddToProjection(Microsoft.EntityFrameworkCore.Query.SqlExpressions.SqlExpression, System.String, Boolean)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplyProjection(System.Linq.Expressions.Expression, Microsoft.EntityFrameworkCore.Query.ResultCardinality, Microsoft.EntityFrameworkCore.QuerySplittingBehavior)
   at Microsoft.EntityFrameworkCore.Query.Internal.SelectExpressionProjectionApplyingExpressionVisitor.VisitExtension(System.Linq.Expressions.Expression)
   at System.Linq.Expressions.Expression.Accept(System.Linq.Expressions.ExpressionVisitor)
   at System.Linq.Expressions.ExpressionVisitor.Visit(System.Linq.Expressions.Expression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryTranslationPostprocessor.Process(System.Linq.Expressions.Expression)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Linq.Expressions.Expression)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Linq.Expressions.Expression, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Microsoft.EntityFrameworkCore.Storage.IDatabase, System.Linq.Expressions.Expression, Microsoft.EntityFrameworkCore.Metadata.IModel, Boolean)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler+<>c__DisplayClass9_0`1[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].<Execute>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Object, System.Func`1<System.Func`2<Microsoft.EntityFrameworkCore.Query.QueryContext,System.__Canon>>)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.Execute[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Linq.Expressions.Expression)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.Execute[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Linq.Expressions.Expression)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ToQueryString(System.Linq.IQueryable)
   at Program.Main()

provider and version information

EF Core version: 6.0.1
Database provider: Microsoft.EntityFrameworkCore.SqlServer 6.0.1
Target framework: .NET 6.0
Operating system: Windows 10 21H1
IDE: Visual Studio 2022 17.0.4

@smitpatel
Copy link
Member

For triage

SELECT [t].[Value] AS [A], COALESCE(SUM([t].[Id]), 0) AS [B], COALESCE((
    SELECT TOP(1) (
        SELECT COALESCE(SUM([t1].[Id]), 0)
        FROM [Table] AS [t1]
        WHERE ([t].[Value] = [t1].[Value]) OR ([t].[Value] IS NULL AND [t1].[Value] IS NULL)) + COALESCE(SUM([t0].[Id]), 0)
    FROM [Table] AS [t0]
    GROUP BY [t0].[Value]
    ORDER BY (SELECT 1)), 0) AS [C]
FROM [Table] AS [t]
GROUP BY [t].[Value]

@bachratyg
Copy link
Author

Looks like this would have the same perf issue as #27102 (comment). Is this a limitation of the current architecture?

@smitpatel
Copy link
Member

@bachratyg - If you are talking about this part ([t].[Value] = [t1].[Value]) OR ([t].[Value] IS NULL AND [t1].[Value] IS NULL) then both columns are nullable so comparing them to be null at same time is required to generate same results as C#. It doesn't compute anything complex like #27102. This is not limitation it is by design. Please read https://docs.microsoft.com/ef/core/querying/null-comparisons

@bachratyg
Copy link
Author

No, my issue is not with c# null behavior. I'd expect that would get translated as [t].[Value] = [t1].[Value] with UseRelationalNulls(true).

My problem is the proposed translation references the table 3 times instead of the necessary 2. I've checked the execution plan in SSMS and that shows that this would hit the table 5 times whereas the translation I would expect (and something similar was generated by EFC5) only hits it 2 times. Considering this is an oversimplified query on an empty table I'd expect actual perf would be worse if the base query is more complex.

@smitpatel
Copy link
Member

The non-simplified form of the first aggregate is limitation of the fix in patch branch. It will be translated efficiently in 7.0

@bachratyg
Copy link
Author

Understood. Then I will have to do some perf regression testing as we're moving from EFC5 to EFC6. Can you point me to an issue where I can track work on this?

smitpatel added a commit that referenced this issue Jan 6, 2022
Correlate the scalar subquery with parent SelectExpression

Resolves #27094
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jan 6, 2022
smitpatel added a commit that referenced this issue Jan 7, 2022
Correlate the scalar subquery with parent SelectExpression

Resolves #27094
@ajcvickers ajcvickers modified the milestones: 6.0.x, 6.0.2 Jan 10, 2022
smitpatel added a commit that referenced this issue Jan 10, 2022
Correlate the scalar subquery with parent SelectExpression

Resolves #27094
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

3 participants