-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Queries really slow due to null checks #17543
Comments
@smitpatel @maumar What's the risk in doing the optimization here? Perf could be a reason to fix this for 3.0. |
/cc @divega |
I think it is related to #15892 which is about bringing back the additional layer of cache based on null/non-null parameter values. |
Also, we still have the ability to switch on database null semantics as a workaround, correct? |
To be clear, in the example above WHERE (@__testId_0 IS NULL AND [r].[TestId] IS NULL) OR ([r].[TestId] = @__testId_0) |
As @jamesgurung says, the original query above could use some null semantics improvements which are unrelated to a parameter-sniffing cache layer: @maumar is aware of this. @jamesgurung, can you please report on the perf of your suggestion in #17543 (comment))? |
The query I suggested runs in about 0.05 seconds, which is the same speed as running the query without any null checks. So ~50x faster. |
@jamesgurung thanks for the info. It's disappointing that SQL Server doesn't short-circuit this entirely, but in any case we'll be removing the unnecessary checks on our side. |
related: #16221 |
Please speed up this issues |
Yeah this needs to be fixed asap. We just deployed code that uses 3.0 and had to immediately revert to 2.2 because simple queries blew up our SQL Azure CPU usage. Went from under 50% to 100% and stayed there until we rolled back. |
Is there any possible mitigation for this in the 3.0 release? |
@mscrivo and others, this is currently to planned for fixing in 3.1, which will be released by the end of the year. We won't be fixing this in 3.0. |
@roji Thanks for the update. That's rather disappointing as it means we have to either rollback a month's worth of work and maintain it on a branch until 3.1 is out, or scramble and find all places in our code where this pattern exists and rewrite it to force EF Core to generate different SQL. This should have definitely been called out in the known issues of the release notes. Here's what we're seeing for reference: declare @_param_ID_0 bigint=111111
-- EF Core 3.0
SELECT [i].[ID], ...
FROM [Table] AS [i]
WHERE (
(
([i].[OtherNullableID] = @_param_ID_0) AND ([i].[OtherNullableID] IS NOT NULL AND @_param_ID_0 IS NOT NULL)
)
OR
(
[i].[OtherNullableID] IS NULL AND @_param_ID_0 IS NULL
)
)
AND ([i].[Type] <> 1)
/*
Table 'Table'. Scan count 13, logical reads 1983575, physical reads 6539, page server reads 0, read-ahead reads 1651369, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 57124 ms, elapsed time = 10894 ms.
*/
-- EF Core 2.2
(@_param_ID_0 bigint)SELECT [_].[ID], ...
FROM [Table] AS [_]
WHERE ([_].[OtherNullableID] = @_param_ID_0) AND ([_].[Type] <> 1)
/*
Table 'Table'. Scan count 2, logical reads 8, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
*/ On a table with 91 million records. Not big by any stretch. |
We encountered a SQL Server 100% CPU utilization issue on Alibaba Cloud with many |
This is part of #17543 - Queries really slow due to null checks Problem was that at the time we performed null semantics and related optimizations we didn't know the value of parameters, so we always had to assume they can be nullable, which resulted in additional IS NULL checks being created. However, later in the pipeline we know the parameter values and we can optimize out those checks for parameters whose values are not null.
This is part of #17543 - Queries really slow due to null checks Problem was that at the time we performed null semantics and related optimizations we didn't know the value of parameters, so we always had to assume they can be nullable, which resulted in additional IS NULL checks being created. However, later in the pipeline we know the parameter values and we can optimize out those checks for parameters whose values are not null.
This is part of #17543 - Queries really slow due to null checks Problem was that at the time we performed null semantics and related optimizations we didn't know the value of parameters, so we always had to assume they can be nullable, which resulted in additional IS NULL checks being created. However, later in the pipeline we know the parameter values and we can optimize out those checks for parameters whose values are not null.
This is part of #17543 - Queries really slow due to null checks Problem was that at the time we performed null semantics and related optimizations we didn't know the value of parameters, so we always had to assume they can be nullable, which resulted in additional IS NULL checks being created. However, later in the pipeline we know the parameter values and we can optimize out those checks for parameters whose values are not null.
Partial fix has been checked in - 069d334 This addresses queries that heavily rely on SQL parameters (closure variables). Outstanding issues are adding "simplified" null semantics optimization that would not add extra terms in non-negated predicates to distinguish null from false, and optimize scenarios around enum flags (#18500) |
FYI, fix got into the latest nightly build (3.1.0-preview2.19522.3) |
@maumar Thank you. Is this a full fix to the null semantics as described in this issue, or the partial fix you mentioned above? |
@jamesgurung just the partial fix for now. It should be an improvement for scenarios that heavily use parameters, but there are more optimizations to do (and I will report the progress on this thread as they get checked in). With the current fix, the original query presented in the issue will now look like this: SELECT TOP(1) [r].[Id]
FROM [Results] AS [r]
WHERE (([r].[TestId] = @__testId_0) AND [r].[TestId] IS NOT NULL) I will be adding more optimizations shortly and will post updates on this thread. |
update: checked in another optimization that deals with enum flags (and other complex expressions compared to null). See #16078 for additional info. |
Resolves #17543 - Queries really slow due to null checks Resolves #18525 - Query: optimize binary expression AndAlso and OrElse where left and right are the same Resolves #18547 - DbFunctions compared to NULL are ignored and break the query builder #17543 Before when rewriting null comparisons we would always remove possibility of nulls in the resulting expression. This is not always needed, specifically in predicates, where it doesn't really matter if expression returns false or null - the result is the same. Now we detect those cases and apply "simplified" null expansion which should lead to better performance. #18525 Null semantics initially expands expressions to a verbose form which then gets simplified in query optimizer. One of the optimizations added is when we do AND/OR where both sides are the same. Other small improvements have been added in this change also. #18525 Previously we assumed that function can only be null if at least one of its arguments is null. This is the case for most built-in functions, but not all. This also goes for user defined functions which can return arbitrary results. Fix is to treat all functions as potentially nullable. This leads to worse queries is some cases, but is also mitigated by other improvements added along side. In the future we will provide metadata to better determine given function's nullability. Also fixed a few small bug found along the way - incorrect Equals comparison for SqlExpression, for cases when TypeMapping was null.
Resolves #17543 - Queries really slow due to null checks Resolves #18525 - Query: optimize binary expression AndAlso and OrElse where left and right are the same Resolves #18547 - DbFunctions compared to NULL are ignored and break the query builder #17543 Before when rewriting null comparisons we would always remove possibility of nulls in the resulting expression. This is not always needed, specifically in predicates, where it doesn't really matter if expression returns false or null - the result is the same. Now we detect those cases and apply "simplified" null expansion which should lead to better performance. #18525 Null semantics initially expands expressions to a verbose form which then gets simplified in query optimizer. One of the optimizations added is when we do AND/OR where both sides are the same. Other small improvements have been added in this change also. #18525 Previously we assumed that function can only be null if at least one of its arguments is null. This is the case for most built-in functions, but not all. This also goes for user defined functions which can return arbitrary results. Fix is to treat all functions as potentially nullable. This leads to worse queries is some cases, but is also mitigated by other improvements added along side. In the future we will provide metadata to better determine given function's nullability. Also fixed a few small bug found along the way - incorrect Equals comparison for SqlExpression, for cases when TypeMapping was null. Additional refactoring: - removed the second pass of sql optimizations (we do it later when sniffing parameter values anyway) - consolidated NullComparisonTransformingExpressionVisitor into SqlExpressionOptimizingExpressionVisitor
Resolves #17543 - Queries really slow due to null checks Resolves #18525 - Query: optimize binary expression AndAlso and OrElse where left and right are the same Resolves #18547 - DbFunctions compared to NULL are ignored and break the query builder #17543 Before when rewriting null comparisons we would always remove possibility of nulls in the resulting expression. This is not always needed, specifically in predicates, where it doesn't really matter if expression returns false or null - the result is the same. Now we detect those cases and apply "simplified" null expansion which should lead to better performance. #18525 Null semantics initially expands expressions to a verbose form which then gets simplified in query optimizer. One of the optimizations added is when we do AND/OR where both sides are the same. Other small improvements have been added in this change also. #18547 Previously we assumed that function can only be null if at least one of its arguments is null. This is the case for most built-in functions, but not all. This also goes for user defined functions which can return arbitrary results. Fix is to treat all functions as potentially nullable. This leads to worse queries is some cases, but is also mitigated by other improvements added along side. In the future we will provide metadata to better determine given function's nullability. Also fixed a few small bug found along the way - incorrect Equals comparison for SqlExpression, for cases when TypeMapping was null. Additional refactoring: - removed the second pass of sql optimizations (we do it later when sniffing parameter values anyway) - consolidated NullComparisonTransformingExpressionVisitor into SqlExpressionOptimizingExpressionVisitor
@maumar Fantastic - will be upgrading from 2.x to 3.1 Preview 2 when it's out! |
I'm getting a lot of slowdowns on the EF Core 3 preview. Here's an example.
This generates the following SQL, which runs really slowly on my SQL Azure database (>3 seconds):
The SQL I would expect is (<0.1 seconds):
Further technical details
EF Core version: 3.0.0-preview8.19405.11
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10
IDE: Visual Studio 2019 Version 16.3 Preview 2
The text was updated successfully, but these errors were encountered: