Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashset contains becomes a traditional IN statement instead of the new OpenJSON parameter #32365

Closed
ngelotte opened this issue Nov 20, 2023 · 19 comments · Fixed by #33920
Closed
Assignees
Labels
area-perf area-primitive-collections area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-enhancement
Milestone

Comments

@ngelotte
Copy link

File a bug

If you use a HashSet for the parameter which is used by contains it generates the old style non parameterized query

HashSet<string> filterData = ["Thing1", "Thing2"];
var results =  db.TableOfStuff.Where(t => filterData.Contains(t.Item)).ToList();

The following code generates the new parameterized query but it seems like an unnecessary ToList and an easy place to miss adding the ToList and end up with a performance degradation.

HashSet<string> filterData = ["Thing1", "Thing2"];
var results =  db.TableOfStuff.Where(t => filterData.ToList().Contains(t.Item)).ToList();

Include provider and version information

EF Core version: 8.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .Net 8.0
Operating system: Windows
IDE: Visual Studio 17.8

@roji roji self-assigned this Nov 20, 2023
@AlexeiScherbakov

This comment was marked as off-topic.

@roji

This comment was marked as off-topic.

@AlexeiScherbakov

This comment was marked as off-topic.

@roji
Copy link
Member

roji commented Nov 21, 2023

Thanks for reporting this, I've confirmed the bug and will investigate.

Repro
await using var ctx = new BlogContext();
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();

var ids = new HashSet<int> { 1, 2 };
_ = ctx.Blogs.Where(x => ids.Contains(x.Id)).ToList();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    public string? Name { get; set; }
}

@dggmez
Copy link

dggmez commented Nov 22, 2023

Came here to report this. It seems it works correctly with List, IEnumerable and arrays. It doesn't work with HashSet, ImmutableList, ImmutableArray, ImmutableHashSet, FrozenSet, etc.

@ibrahim33
Copy link

Hello everyone. Is there an ETA when this will be released? We are currently using a custom implementation of OPENJSON and we cannot remove it nor upgrade to EF 8 because of the bug.

LINQ with HashSet:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToHashSet().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with HashSet:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (CAST(213171 AS bigint), CAST(234107 AS bigint), CAST(254039 AS bigint), CAST(252018 AS bigint), CAST(257030 AS bigint), CAST(282759 AS bigint), CAST(290215 AS bigint), CAST(352284 AS bigint), CAST(354318 AS bigint), CAST(354179 AS bigint))
ORDER BY [p].[Id]

LINQ with List:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToList().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with List:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (
    SELECT [i].[value]
    FROM OPENJSON(@__ids_0) WITH ([value] bigint '$') AS [i]
)
ORDER BY [p].[Id]

@roji
Copy link
Member

roji commented Jan 11, 2024

we cannot remove it nor upgrade to EF 8 because of the bug

Why is this blocking you from upgrading to EF 8? EF 8 simply behaves in the same way as EF 7 on this particular point.

@ibrahim33
Copy link

we cannot remove it nor upgrade to EF 8 because of the bug

Why is this blocking you from upgrading to EF 8? EF 8 simply behaves in the same way as EF 7 on this particular point.

It is just because of the way we implemented our version of OPENJSON, which overrides some methods of the ef core sql generator and some of these methods no longer work the same way in EF 8 or are removed. We could certainly adjust the implementation to make it work, but we would rather not invest this time if this issue can be addressed in the near future. That is the reason I am asking for an ETA

@roji
Copy link
Member

roji commented Jan 11, 2024

I'm not sure how this particular problem could be a blocker - if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

@ibrahim33
Copy link

if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

It is not a blocker but we would then completely remove our custom implementation when this is fixed, instead of having a hybrid solution in which lists are handled by EF and Hashsets by us.

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

Thank you for the info.

@ADNewsom09
Copy link
Contributor

ADNewsom09 commented Jan 18, 2024

You can do .AsEnumerable().Contains( on the hashset and it will pass as an openjson call without the overhead.

It appears the issue is it is translating HashSet's overridden .Contains method differently from IEnumerable's or ICollections'

@m-gasser
Copy link

I'm not sure how this particular problem could be a blocker - if some methods which you're overriding changed between EF 7 and 8, then you'll need to change your implementation after this is fixed as well.

It probably is not a blocker for anyone except us. We are trying since many years to allow our developers to use EF Core in addition to our inhouse ORM. With EF Core 7 and custom optimizations for IN Statements and for #30912 we could finally reach performance that was good enough to greenlight EF Core for most of our databases.
Most of our .Contains() statements are with sets and not lists, so it is crucial for us to get the OpenJSON optimization there too. It would be fine, if we have to customize EF Core for that, like we did for EF Core 7. We have to keep some customization anyways for #30912. Our problem is, we can no longer hook in at the same place, as we did with EFCore 7, because things are now handled differently.
Any pointer where we can hook into EF Core 8 to reimplement the OpenJSON support for sets would be greatly appreciated!

In any case, I definitely want to fix this for 9, but I think it's pretty unlikely we'll backport the fix to 8, since it's not a regression and we haven't had many complaints yet.

We are aware that compared to EFCore 7 it is not a regression but a big improvement! It is however a regression compared to our customization that we have to do, to reach our performance targets. It would be quite unfortunate if we had to stay on EF Core 7 for another whole year (and ironically for the feature of .NET 8 we were the most excited about).

@OndrejUzovic
Copy link

Please do not fix this bug until #32147 is fixed.
This bug acts as an important workaround for #32147.

@m-gasser
Copy link

@OndrejUzovic You can also use EF.Constant() as a workaround for that bug. This is a supported feature and will continue to work in future versions. This seems like the better workaround to use, no need to rely on another bug and trust that the other bug does not get fixed.

@roji
Copy link
Member

roji commented Feb 23, 2024

@OndrejUzovic @m-gasser EF.Constant() is indeed a good workaround, but has quite a heavy performance overhead. However, it's very likely we'll provide a more first-class way to opt-out of OPENJSON translation for 9.0, so both these bugs will likely go away.

@vas6ili
Copy link

vas6ili commented Apr 23, 2024

You can also use EF.Constant() as a workaround for that bug. This is a supported feature and will continue to work in future versions.

While I agree with this general statement, we find ourselves in a fairly unusual situation where we're in the process of migrating a fairly large application from EF6 to EF Core and have to support both ORMs for a some time before we sort out all the bugs/differences and the same LINQ expressions have to be translatable in both ORMs, in this case the HashSet<> workaround worked better for us.

@cincuranet cincuranet assigned cincuranet and unassigned roji May 28, 2024
@cincuranet cincuranet modified the milestones: Backlog, 9.0.0 May 29, 2024
@cincuranet
Copy link
Contributor

Note to self (from design): We should handle Contains on any "ICollection", not only HashSet.

@cincuranet cincuranet added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jun 6, 2024
@OndrejUzovic
Copy link

Since by fixing this bug the workaround for #32147 will not work anymore.
Are there any plans to fix also #32147?

@roji
Copy link
Member

roji commented Jun 10, 2024

@OndrejUzovic yeah, we're definitely planning on fixing that (or providing a better workaround) for 9.0.

@ajcvickers ajcvickers modified the milestones: 9.0.0, 9.0.0-preview6 Jun 21, 2024
@roji roji modified the milestones: 9.0.0-preview6, 9.0.0 Oct 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf area-primitive-collections area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.