Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EF core 6 selecting null values despite where clause asking for not null #26744

Closed
BewaAutomatisierung-RD opened this issue Nov 18, 2021 · 20 comments
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression Servicing-approved type-bug
Milestone

Comments

@BewaAutomatisierung-RD
Copy link

I have a Linq2Sql Query like this:

Parent.Include(p => p.Children)
  .Where(p => p.Children.Any(c => c.SomeNullableDateTime == null)
  	&& p.Children
  		.Where(c => c.SomeNullableDateTime == null)
  		.OrderBy(c => c.SomeInteger)
  		.First()
  		.SomeOtherNullableDateTime != null
  )
  .Select(p => p.Children
  		.Where(c => c.SomeNullableDateTime == null)
  		.OrderBy(c => c.SomeInteger)
  		.First()
  		.SomeOtherNullableDateTime)
  .ToList();

This worked fine until moving from EF core 5 to EF core 6. With EF core 6 the result list contains some null values (which should not be the case because the where condition asks for not null). Is there some breaking change / limitation in EF core 6 I'm not aware of or is this simply a bug?

This is an excerpt of the output

enter image description here

This is the generated SQL Statement

SELECT(
	SELECT TOP(1)[p1].[SomeOtherNullableDateTime]
	FROM[Children] AS[p1]
	WHERE([p].[Id] = [p1].[ParentId]) AND[p1].[SomeNullableDateTime] IS NULL
	ORDER BY[p1].[SomeInteger])
FROM[Parent] AS[p]
WHERE EXISTS(
	SELECT 1
	FROM[Children] AS [c]
	WHERE ([p].[Id] = [c].[ParentId]) AND[c].[SomeNullableDateTime] IS NULL) AND EXISTS(
   SELECT 1
   FROM[Children] AS [c0]
	WHERE ([p].[Id] = [c0].[ParentId]) AND[c0].[SomeNullableDateTime] IS NULL)
GO

So it looks like the problem is that SomeOtherNullableDateTime (which is supposed to be not null) is not even included in the where clause of the generated SQL.

@BewaAutomatisierung-RD
Copy link
Author

This is the SQL EF core 5 (correctly) generates

SELECT (
    SELECT TOP(1) [c].[SomeOtherNullableDateTime]
    FROM [Children] AS [c]
    WHERE ([p].[Id] = [c].[ParentId]) AND [c].[SomeNullableDateTime] IS NULL
    ORDER BY [c].[SomeInteger])
FROM [Parent] AS [p]
WHERE EXISTS (
    SELECT 1
    FROM [Children] AS [c0]
    WHERE ([p].[Id] = [c0].[ParentId]) AND [c0].[SomeNullableDateTime] IS NULL) AND (
    SELECT TOP(1) [c1].[SomeOtherNullableDateTime]
    FROM [Children] AS [c1]
    WHERE ([p].[Id] = [c1].[ParentId]) AND [c1].[SomeNullableDateTime] IS NULL
    ORDER BY [c1].[SomeInteger]) IS NOT NULL
GO

@BewaAutomatisierung-RD
Copy link
Author

On Stackoverflow (https://stackoverflow.com/questions/70017594/ef-core-6-selecting-null-values-despite-where-clause-asking-for-not-null/70019439#70019439) it has been stated by Evan Stoev that even if the query is formulated in other ways, the generated SQL is still wrong.

Example 1:

var query = db.Set<Parent>()
    .Select(p => p.Children
        .Where(c => c.SomeNullableDateTime == null)
        .OrderBy(c => c.SomeInteger)
        .FirstOrDefault())
    .Where(c => c.SomeOtherNullableDateTime != null)
    .Select(c => c.SomeNullableDateTime);

generates

SELECT (
    SELECT TOP(1) [c0].[SomeNullableDateTime]
    FROM [Child] AS [c0]
    WHERE ([p].[Id] = [c0].[ParentId]) AND [c0].[SomeNullableDateTime] IS NULL
    ORDER BY [c0].[SomeInteger])
FROM [Parent] AS [p]
WHERE EXISTS (
    SELECT 1
    FROM [Child] AS [c]
    WHERE ([p].[Id] = [c].[ParentId]) AND [c].[SomeNullableDateTime] IS NULL)

which is also missing the IS NOT NULL criteria

Example 2:

var query = db.Set<Parent>()
    .SelectMany(p => p.Children
        .Where(c => c.SomeNullableDateTime == null)
        .OrderBy(c => c.SomeInteger)
        .Take(1))
    .Where(c => c.SomeOtherNullableDateTime != null)
    .Select(c => c.SomeNullableDateTime);

generates

SELECT [t0].[SomeNullableDateTime]
FROM [Parent] AS [p]
INNER JOIN (
    SELECT [t].[ParentId], [t].[SomeNullableDateTime], [t].[SomeOtherNullableDateTime]
    FROM (
        SELECT [c].[ParentId], [c].[SomeNullableDateTime], [c].[SomeOtherNullableDateTime], ROW_NUMBER() OVER(PARTITION BY [c].[ParentId], [c].[SomeNullableDateTime] ORDER BY [c].[SomeInteger]) AS [row]
        FROM [Child] AS [c]
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON ([p].[Id] = [t0].[ParentId]) AND [t0].[SomeNullableDateTime] IS NULL
WHERE [t0].[SomeOtherNullableDateTime] IS NOT NULL

which is also wrong.

@BewaAutomatisierung-RD
Copy link
Author

I've just spend a couple of days migrating a project to .NET 6 and EF core 6 (it took a while because many-to-many relationships are now handled differently). And now I cannot trust my application anymore because I don't know which of the many queries produce the correct results and which don't !

@smitpatel
Copy link
Contributor

Regression from #18476

@sdanyliv
Copy link

@smitpatel, note that second sample is some regression in subquery optimization. Predicate is moved out of the Window and ROW_NUMBER produces wrong result.

@smitpatel
Copy link
Contributor

Filed #26756 for example 2

@BewaAutomatisierung-RD
Copy link
Author

Can you please give an ETA for fixing these bugs!? I cannot go live like this and need to know if I can wait for the fixes or if I have to go back to EF 5 .

@ajcvickers
Copy link
Member

@BewaControl-ReneDivossen Unfortunately, the window has been missed for the 6.0.1 patch (December) and the next patch after that is scheduled for February. This is something we would very much like to improve and have been discussing a lot on the team. However, the overall build and validation process for .NET is very slow, and the holiday period makes things worse, so there is no quick solution.

@BewaAutomatisierung-RD
Copy link
Author

@ajcvickers : This is very bad news! Like this, EF 6 is not ready for production. If we cannot rely on queries to deliver correct result sets we cannot use EF 6. The example given at the top of this issue is a litte academic (regarding the select part) but we do have code like this:

var items = Parent
  .Where(p => p.Children.Any(c => c.SomeNullableDateTime == null)
  	&& p.Children
  		.Where(c => c.SomeNullableDateTime == null)
  		.OrderBy(c => c.SomeInteger)
  		.First()
  		.SomeOtherNullableDateTime != null
        && p.SomeOtherConditions
  )
  .ToList();

// Do something with items that must only be done with items meeting the where-requirements

The worst case would be using a query like this to get items to be deleted. Since right now EF core 6 returns more items than meet the condition, there would be a loss of data!

Since all three described ways of getting items meeting the where condition provide wrong results I'm wondering if there is some other query that would work?

Can you confirm that the where condition generates the wrong SQL only because of the "!= null" part ??? Or will other where conditions on Children fail as well ??? Regarding the example using the "SelectMany": Will all queries using "SelectMany" be wrong?

@BewaAutomatisierung-RD
Copy link
Author

BewaAutomatisierung-RD commented Nov 19, 2021

@ajcvickers : I understand that flexibility is something hard to implement in large cooperations but I would like to suggest that you postpone the 6.0.1 release until these bugs are fixed. If it were bugs that would lead to exceptions being thrown because a query could not be translated correctly then I would be fine with leaving them in a release because the users would for sure notice that something ist wrong. But leaving bugs in a release that quietly deliver wrong results just feels wrong! Your users may mess up their whole data before even noticing that something is wrong. There is a good chance that some of your users will not be able to clean up the mess once they do find out about it! Some may never find out about it and will keep on making decisions based on wrong data!

@ajcvickers
Copy link
Member

@BewaControl-ReneDivossen I will let @smitpatel comment on the technical aspects. With regard to the patch schedule, that is not something that we control. However, we are thinking hard about how we can turn around patches more quickly in the future, at least for EF.

@BewaAutomatisierung-RD
Copy link
Author

@ajcvickers Thanks for the quick reply. At least now I know that I have to go back to EF 5 (which will be a pain because I've just removed all references to Join-Tables in our code because EF 6 scaffolding of many-to-many relationships is different from EF 5 and we are using database-first). My learning from this is: If you are working in a fairly small company and don't have unlimited resources then don't use Point-Zero versions ! (Which results in a conflict as well: If no one is using Point-Zero versions, no one is going to find the bugs.)

@m-gug
Copy link

m-gug commented Dec 1, 2021

We stumbled upon the same issue during our migration tests to EF Core 6. Fortunately, one of the unit tests hit it before we really switched.
Unfortunately, because of the error, incorrect results are returned for a large number of queries. You could avoid the error if you do not check for NULL but for HasValue, but that would mean changing all queries and hoping that nothing is overlooked.
If further information is necessary, we have a simple test project to reproduce the bug, just let us know.

Sample query that is not working in EF Core 6:
_context.Parent.Where(e => e.Children.FirstOrDefault().CreatedAt == null)

Possible workaround:
_context.Parent.Where(e => !e.Children.FirstOrDefault().CreatedAt.HasValue).ToListAsync()

@BewaAutomatisierung-RD
Copy link
Author

@ajcvickers : Will this be fixed with 6.0.2 ?

@BewaAutomatisierung-RD
Copy link
Author

@smitpatel : Can you please give some technical input on my question from 2021/11/19:
"Can you confirm that the where condition generates the wrong SQL only because of the "!= null" part ??? Or will other where conditions on Children fail as well ??? Regarding the example using the "SelectMany": Will all queries using "SelectMany" be wrong?"

@ajcvickers
Copy link
Member

@BewaControl-ReneDivossen The fix here is somewhat complicated, and won't make it for 6.0.2. We're still hoping we can come up with a fix that is low risk enough to include in 6.0.3.

smitpatel added a commit that referenced this issue Jan 26, 2022
…tion

Resolves #26744
A better fix for #18476

Initial fix for #18476 assumed that whenever we have single result operation compared to null, it will only be true if the result of single result is default when sequence is empty. This was correct for the query in the issue tracker which had anonymous type projection.
Anonymous type is never null as long as there is data, it can be only null value when default is invoked i.e. empty sequence.
Hence we added optimization for that but it didn't restrict to just anonymous type.
For entity type projection when entity is not nullable, the same logic holds true. This helped us translate queries which wouldn't work with entity equality due to composite key from a subquery.
But optimization was incorrect for the result which can be null (nullable scalar or nullable entity) as an non-empty sequence can have first result to be null which can match.

The improved fix avoids doing the unrestricted optimization during preprocessing phase. Instead we moved the logic to translation phase where we can evaluate the shape of the projection coming out subquery. Now we only apply optimization for non-nullable entity and anonymous type. Scalar comparison will work by comparing to null and nullable entity will work if entity equality covers it. It will start throwing error if composite key though earlier version possibly generated wrong results for it.
@smitpatel smitpatel added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jan 26, 2022
@BewaAutomatisierung-RD
Copy link
Author

@ajcvickers Since there is a fix for this issue and the related #26756 now - what is the ETA for the release ?

@ajcvickers ajcvickers removed this from the 6.0.x milestone Jan 27, 2022
@ajcvickers
Copy link
Member

@BewaControl-ReneDivossen This has been approved for the 6.0.3 patch release, which is currently planned for March.

maumar added a commit that referenced this issue Feb 11, 2022
…s after merging from release/6.0

Improvement on the previous patch fix (#26744). We were not taking into the account scenario where collection projected from FirstOrDefault was compared to null. Collection can only be null if the parent entity is null, so we can safely apply the same optimization we do for anonymous types and required entities.

Fixes #27356
maumar added a commit that referenced this issue Feb 11, 2022
…s after merging from release/6.0

Improvement on the previous patch fix (#26744). We were not taking into the account scenario where collection projected from FirstOrDefault was compared to null. Collection can only be null if the parent entity is null, so we can safely apply the same optimization we do for anonymous types and required entities.

Fixes #27356
dougbu pushed a commit that referenced this issue Mar 2, 2022
…s after merging from release/6.0 (#27429)

Improvement on the previous patch fix (#26744). We were not taking into the account scenario where collection projected from FirstOrDefault was compared to null. Collection can only be null if the parent entity is null, so we can safely apply the same optimization we do for anonymous types and required entities.

Fixes #27356
@space-alien
Copy link

space-alien commented Jul 13, 2023

I have just run into this in EF 7.0.8. I'm honestly shocked.

ServicePlan Model:

public DateTime? TerminationDate { get; private set; }

Query:

.Select(c => c.CurrentServicePlan == null ? (bool?)null : c.CurrentServicePlan.TerminationDate.HasValue)

This query returns false when CurrentServicePlan is null. 🥴

@ajcvickers
Copy link
Member

@space-alien Please open a new issue and attach a small, runnable project or post a small, runnable code listing that reproduces what you are seeing so that we can investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported regression Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

6 participants