Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy with outer reference not translating properly, causing data loss #27102

Closed
bachratyg opened this issue Jan 3, 2022 · 3 comments
Closed
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Milestone

Comments

@bachratyg
Copy link

bachratyg commented Jan 3, 2022

Some query logic is optimized away in the following query. No exception is thrown, the query executes successfully, returns the correct shape but bad data. This is a regression, it used to work in EF Core 5.
Possibly related to #27094.

Repro

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="6.0.1" />
  </ItemGroup>
</Project>
class Program
{
	public class Table
	{
		public int Id { get; set; }
		public int? Value { get; set; }
	}
	public class TestDb : DbContext
	{
		public DbSet<Table> Table { get; set; }
		protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
		{
			optionsBuilder.UseSqlServer("data source=dummy");
		}
	}
	static void Main()
	{
		using var db = new TestDb();
		var q = from t in db.Table
				group t.Id by t.Value into tg
				select new
				{
					A = tg.Key,
					B = db.Table.Where(t => t.Value == tg.Max() * 6).Max(t => (int?)t.Id),
				};
		var sql = q.ToQueryString();
	}
}

Expected translation (approx):

SELECT m0.Value AS A, (
    SELECT MAX(m.Id)
    FROM Table AS m
    WHERE m.Value == (MAX(m0.Id) * 6)) AS B
FROM Table AS m0
GROUP BY m0.Value

Actual translation:

SELECT m.Value AS A, MAX(m.Id) AS B
FROM Table AS m
GROUP BY m.Value

provider and version information

EF Core version: 6.0.1
Database provider: Microsoft.EntityFrameworkCore.SqlServer 6.0.1
Target framework: .NET 6.0
Operating system: Windows 10 21H1
IDE: Visual Studio 2022 17.0.4

@ajcvickers ajcvickers added this to the 6.0.x milestone Jan 4, 2022
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jan 5, 2022
smitpatel added a commit that referenced this issue Jan 5, 2022
Earlier, we didn't match on exact predicate

Also fix bug in LikeExpression.Equals

Resolves #27102
@smitpatel
Copy link
Contributor

Generated SQL after fix

SELECT [t].[Value] AS [A], (
    SELECT MAX([t0].[Id])
    FROM [Table] AS [t0]
    WHERE ([t0].[Value] = ((
        SELECT MAX([t1].[Id])
        FROM [Table] AS [t1]
        WHERE ([t].[Value] = [t1].[Value]) OR ([t].[Value] IS NULL AND [t1].[Value] IS NULL)) * 6)) OR ([t0].[Value] IS NULL AND (
        SELECT MAX([t1].[Id])
        FROM [Table] AS [t1]
        WHERE ([t].[Value] = [t1].[Value]) OR ([t].[Value] IS NULL AND [t1].[Value] IS NULL)) IS NULL)) AS [B]
FROM [Table] AS [t]
GROUP BY [t].[Value]

@bachratyg
Copy link
Author

bachratyg commented Jan 5, 2022

Is the query complexity due to UseRelationalNulls(false)?

This may have some perf issues since the base query is repeated in the subquery.

Edit/clarification:
by perf issue I didn't mean the extra code to compensate for c# null semantics but that the subquery is hitting the table more times than necessary. With db null semantics I'd expect this would be the translation:

SELECT [t].[Value] AS [A], (
    SELECT MAX([t0].[Id])
    FROM [Table] AS [t0]
    WHERE ([t0].[Value] = (
        SELECT MAX([t1].[Id])
        FROM [Table] AS [t1]
        WHERE [t].[Value] = [t1].[Value]) * 6)) AS [B]
FROM [Table] AS [t]
GROUP BY [t].[Value]

@smitpatel
Copy link
Contributor

Value and the result of max both can be null hence we need additional comparison to match results to C#.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

3 participants