Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Server: Rewrite query to support aggregate functions over aggregates/subqueries #34256

Closed
Tracked by #30173
roji opened this issue Jul 20, 2024 · 1 comment · Fixed by #34262
Closed
Tracked by #30173

SQL Server: Rewrite query to support aggregate functions over aggregates/subqueries #34256

roji opened this issue Jul 20, 2024 · 1 comment · Fixed by #34262
Assignees
Labels
area-query area-sqlserver closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-enhancement
Milestone

Comments

@roji
Copy link
Member

roji commented Jul 20, 2024

SQL Server has a limitation, where it doesn't allow aggregate functions invocations over other aggregate functions, or over subqueries; it errors with: "Cannot perform an aggregate function on an expression containing an aggregate or a subquery".

Examples:

// Aggregate over scalar subquery
_ = await context.Blogs
    .Select(b => new
    {
        b.Category,
        FirstPostViews = b.Posts.OrderBy(p => p.Id).FirstOrDefault()!.Views
    })
    .GroupBy(x => x.Category)
    .Select(g => g.Sum(x => x.FirstPostViews))
    .ToListAsync();

// Aggregate over aggregate
_ = await context.Blogs
    .Select(b => new
    {
        b.Category,
        FirstPostViews = b.Posts.Sum(p => p.Views)
    })
    .GroupBy(x => x.Category)
    .Select(g => g.Sum(x => x.FirstPostViews))
    .ToListAsync();

The invalid SQL looks like this:

SELECT SUM((
    SELECT TOP(1) [p].[Views]
    FROM [Post] AS [p]
    WHERE [b].[Id] = [p].[BlogId]
    ORDER BY [p].[Id]))
FROM [Blogs] AS [b]
GROUP BY [b].[Category];

SELECT SUM((
    SELECT SUM([p].[Views])
    FROM [Post] AS [p]
    WHERE [b].[Id] = [p].[BlogId]))
FROM [Blogs] AS [b]
GROUP BY [b].[Category]

We should be able to have a post-processing step that lifts the subquery out of the projection, and integrates it into the SELECT expression as a OUTER APPLY (or CROSS JOIN, if uncorrelated):

SELECT SUM(x.Views)
FROM [Blogs] AS [b]
OUTER APPLY (
    SELECT TOP(1) [p].[Views]
    FROM [Post] AS [p]
    WHERE [b].[Id] = [p].[BlogId]
    ORDER BY [p].[Id]
) AS [x]
GROUP BY [b].[Category];

SELECT SUM([x].[ViewSum])
FROM [Blogs] AS [b]
OUTER APPLY (
    SELECT SUM([p].[Views]) AS [ViewSum]
    FROM [Post] AS [p]
    WHERE [b].[Id] = [p].[BlogId]) AS [x]
GROUP BY [b].[Category]

Note that this affects non-GroupBy queries as well.

Full code
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();

// Aggregate over scalar subquery
_ = await context.Blogs
    .Select(b => new
    {
        b.Category,
        FirstPostViews = b.Posts.OrderBy(p => p.Id).FirstOrDefault()!.Views
    })
    .GroupBy(x => x.Category)
    .Select(g => g.Sum(x => x.FirstPostViews))
    .ToListAsync();

// Aggregate over aggregate
_ = await context.Blogs
    .Select(b => new
    {
        b.Category,
        FirstPostViews = b.Posts.Sum(p => p.Views)
    })
    .GroupBy(x => x.Category)
    .Select(g => g.Sum(x => x.FirstPostViews))
    .ToListAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    public string Category { get; set; }

    public List<Post> Posts { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public int Views { get; set; }
    public Blog Blog { get; set; }
}
@roji
Copy link
Member Author

roji commented Jul 20, 2024

Note: see all referencing issues just above for user reports that should be fixed by implementing this.

@roji roji self-assigned this Jul 20, 2024
roji added a commit to roji/efcore that referenced this issue Jul 21, 2024
roji added a commit to roji/efcore that referenced this issue Jul 21, 2024
@roji roji modified the milestones: Backlog, 9.0.0 Jul 21, 2024
@roji roji added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jul 21, 2024
roji added a commit to roji/efcore that referenced this issue Jul 21, 2024
roji added a commit to roji/efcore that referenced this issue Jul 30, 2024
@roji roji closed this as completed in 390b5b0 Jul 30, 2024
@ajcvickers ajcvickers modified the milestones: 9.0.0, 9.0.0-rc1 Aug 21, 2024
@roji roji removed this from the 9.0.0-rc1 milestone Oct 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query area-sqlserver closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants