Global query filter generates SQL with unnecessary columns in projections #20758
Comments
Are you saying you're seeing a perf difference between those two specific SQL queries, i.e. with and without the added columns in the inner query? If so, can you share the benchmark and results? Even if there's no perf issue here (which is what I'd expect), we could still generate tighter SQL (but that would be a lower priority).
After retesting with other tools, the performance of the two queries is similar; the difference is less than 2%. Sorry about this, I obviously used the wrong tool for the benchmark. But tighter SQL would be nice, especially for tables with a lot of columns.
@DanielBlazevic thanks for confirming. I am reopening the issue and putting it in the backlog as an SQL improvement issue (low priority).
Just to add a data point to this... I have a query which selects from 10 tables and produces a SQL query 4,167 characters long. If I add a common global filter on each of the tables, the total query is 23,375 characters. The filters themselves only add about 90 characters per table, 900 in total, so most of that is coming from the long lists of columns that are now being selected in sub-queries. I understand it's not really a perf issue, but it does make reviewing queries that little bit more difficult and I just wanted to give a "real world" example of how big the difference can be for when this gets reviewed again.
Another little annoyance to add: Application Insights seems to limit the logged SQL to 8 KB, so with the above example it's the difference between seeing the full query and only seeing a third of it.
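For anyone who wants to measure this size difference on their own model, here is a minimal sketch. It assumes EF Core 5.0 or later (where `ToQueryString()` is available) and the Microsoft.EntityFrameworkCore.SqlServer package. The `Blog`/`Post` model, the connection string, and the `SqlSizeCheck` helper are hypothetical and not taken from the queries discussed above. As far as I know, `ToQueryString()` only generates the SQL text and does not open a connection, so the database does not need to exist.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical model, loosely following the usual BloggingContext sample.
public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; }
    public int Rating { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }

    // Placeholder connection string (assumption); it is only used to pick the SQL Server provider.
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseSqlServer("Server=.;Database=Blogging;Trusted_Connection=True;");

    // A global query filter on Blog, so queries that navigate to Blog include it.
    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<Blog>().HasQueryFilter(b => b.Rating > 4);
}

public static class SqlSizeCheck
{
    // Returns the length of the generated SQL with and without global query filters.
    public static (int Filtered, int Unfiltered) Measure()
    {
        using var db = new BloggingContext();

        // Same projection twice; the second query opts out of all global filters.
        var filteredSql = db.Posts
            .Select(p => new { p.Blog.Url, p.Content })
            .ToQueryString();

        var unfilteredSql = db.Posts
            .IgnoreQueryFilters()
            .Select(p => new { p.Blog.Url, p.Content })
            .ToQueryString();

        return (filteredSql.Length, unfilteredSql.Length);
    }
}

public static class Program
{
    public static void Main()
    {
        var (filtered, unfiltered) = SqlSizeCheck.Measure();
        Console.WriteLine($"With filters: {filtered} chars, without filters: {unfiltered} chars");
    }
}
```

Printing the two strings instead of their lengths also shows which columns end up in the sub-query, without relying on a logger that may truncate long statements.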
So, finally upgraded a big solution to EF Core 5.0 and... this is fixed? Sub-queries now only select the columns that are used in the final projection or as part of the join. Great news.
Quick example with the BloggingContext, with some filters:
modelBuilder.Entity<Blog>().HasQueryFilter(x => x.Rating > 4);
modelBuilder.Entity<Post>().HasQueryFilter(x => x.Title.StartsWith("The"));
and this query:
_ = db.Posts.Select(x => new
{
    x.Blog.Url,
    x.Content
}).FirstOrDefault();
5.0:
SELECT TOP(1) [t].[Url], [p].[Content]
FROM [Posts] AS [p]
INNER JOIN (
    SELECT [b].[BlogId], [b].[Url]
    FROM [Blogs] AS [b]
    WHERE [b].[Rating] > 4
) AS [t] ON [p].[BlogId] = [t].[BlogId]
WHERE [p].[Title] IS NOT NULL AND ([p].[Title] LIKE N'The%')
3.1:
SELECT TOP(1) [t].[Url], [p].[Content]
FROM [Posts] AS [p]
INNER JOIN (
    SELECT [b].[BlogId], [b].[Rating], [b].[Url]
    FROM [Blogs] AS [b]
    WHERE [b].[Rating] > 4
) AS [t] ON [p].[BlogId] = [t].[BlogId]
WHERE [p].[Title] IS NOT NULL AND ([p].[Title] LIKE N'The%')
3.1 selected Rating in the subquery. The difference is small in this example, but on a big table, 3.1 selecting all columns made it very verbose and harder to read.
The issue of whether it should do a subquery at all is another issue imo. It has been mentioned in other places that filtering in a subquery may be perf beneficial compared to a simple join with the filters on the outer where. If this issue is just about selecting "unnecessary columns", as per the title, then it's fixed for me!
Fixed in #21992. Thanks @stevendarby for testing it out.
@smitpatel I noticed an exception to this where unused columns are still selected when it's part of a set operation that produces a cross apply. Not sure of the exact conditions for it but I have a repro. Obviously this is a fairly minor issue, mostly about tighter/nicer looking SQL, but would be nice to have.
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
{
    using var context = new MyContext();
    context.Database.EnsureCreated();
    // Only necessary columns are selected in sub query on Images
    _ = context.Posts
        .Select(x => new { Type = "Image", Detail = x.Image.Title })
        .ToList();
    // All columns are selected in sub query on Images
    _ = context.Blogs
        .SelectMany(
            blog => blog.Posts.Select(x => new { Type = "Post", Detail = x.Title })
                .Concat(blog.Posts.Select(x => new { Type = "Image", Detail = x.Image.Title })),
            (blog, detail) =>
                new BlogDetailsDto
                {
                    BlogName = blog.Name,
                    Type = detail.Type,
                    Detail = detail.Detail
                })
        .ToList();
}
public class MyContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer("Server=.;Database=Join;Trusted_Connection=True;")
            .LogTo(x => Debug.WriteLine(x), LogLevel.Information);
    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<Image>().HasQueryFilter(x => !x.IsDeleted);
}
public class BlogDetailsDto
{
    public string BlogName { get; set; }
    public string Type { get; set; }
    public string Detail { get; set; }
}
public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }
    public ICollection<Post> Posts { get; set; }
}
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public int BlogId { get; set; }
    public int ImageId { get; set; }
    public Blog Blog { get; set; }
    public Image Image { get; set; }
}
public class Image
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Url { get; set; }
    public bool IsDeleted { get; set; }
    public Post Post { get; set; }
}
Second query SQL:
SELECT [b].[Name] AS [BlogName], [t].[Type], [t].[Detail]
FROM [Blogs] AS [b]
CROSS APPLY (
    SELECT N'Post' AS [Type], [p].[Title] AS [Detail]
    FROM [Posts] AS [p]
    WHERE [b].[Id] = [p].[BlogId]
    UNION ALL
    SELECT N'Image' AS [Type], [t0].[Title] AS [Detail]
    FROM [Posts] AS [p0]
    INNER JOIN (
        SELECT [i].[Id], [i].[IsDeleted], [i].[Title], [i].[Url] -- Here
        FROM [Image] AS [i]
        WHERE [i].[IsDeleted] = CAST(0 AS bit)
    ) AS [t0] ON [p0].[ImageId] = [t0].[Id]
    WHERE [b].[Id] = [p0].[BlogId]
) AS [t]
Pruning not being applied to set operations was fixed in #28142.
Oh great, thank you and apologies for not checking more recent builds.
Steps to reproduce
When using a projection without a global query filter, the generated SQL is as expected. Only two columns are selected in this example:
And the generated SQL is as expected:
If I add a global query filter:
The generated SQL looks like this:
So, my questions are:
I am asking because I measured a performance degradation in complex queries with a global query filter enabled and unnecessary columns in left joins.
Further technical details
EF Core version: 3.1.3
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019 16.5.4