Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect results returned from query under heavy load #24332

Closed
bucky66 opened this issue Mar 5, 2021 · 4 comments
Closed

Incorrect results returned from query under heavy load #24332

bucky66 opened this issue Mar 5, 2021 · 4 comments

Comments

@bucky66
Copy link

bucky66 commented Mar 5, 2021

Incorrect results returned from query under heavy load

One of our services was recently subject to a DDoS attack. The attack created elevated access to a single Azure Sql table. During the attack, reads from the same table threw expected timeout errors like

System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

and

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

However, we occasionally saw the following error:

System.InvalidOperationException: The underlying reader doesn't have as many fields as expected.
at Microsoft.EntityFrameworkCore.Query.Internal.BufferedDataReader.BufferedDataRecord.InitializeFields()
at Microsoft.EntityFrameworkCore.Query.Internal.BufferedDataReader.BufferedDataRecord.InitializeAsync(DbDataReader reader, IReadOnlyList`1 columns, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.BufferedDataReader.InitializeAsync(IReadOnlyList`1 columns, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(DbContext _, Boolean result, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync[TSource](IAsyncEnumerable`1 asyncEnumerable, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.SingleOrDefaultAsync[TSource](IAsyncEnumerable`1 asyncEnumerable, CancellationToken cancellationToken)

In addition to these errors we noticed that successful queries against the aforementioned table would return incorrect data. The code in question looks something like this:

var customer = await dbContext.Customers.SingleOrDefaultAsync(c => c.Id == request.CustomerId, cancellationToken);

After some investigation we added the following snippet just after the query:

if (request.CustomerId != customer.Id) 
{
	//log and throw exception
}

which resulted in many exceptions and logs indicating that the customer data returned from the query was different from the requested customer data.

We were also able to reproduce this problem in our test environment.

Steps to reproduce

  1. Create a target azure sql table
  2. Expose GET and POST endpoints through web api controller
  3. Subject to system to high load
    1. Used k6 to execute POST requests
    2. Table writes
  4. Simulate normal usage
    1. Used k6 to execute GET requests
    2. Table reads
  5. Wait for timeout and underlying reader exceptions

Note: we did decrease the maximum connection pool size to more easily reproduce the problem in the test environment.

Details

Microsoft.EntityFrameworkCore v3.1.12
Microsoft.EntityFrameworkCore.SqlServer v3.1.12
NETCoreApp v3.1.10

@Plasma
Copy link

Plasma commented Mar 6, 2021

I'm not on the EF Core team, but are you accidentally sharing your dbContext between requests?

They should be (if injected) at something like a Scope level (per-request), and not Singleton.

If you inject the context into a repository/broker class, perhaps check too that it's not being shared (and thus the dependency dbContext is shared) between requests.

You can double-check this by just printing out something as simple as the hash code (dbContext.GetHashCode()) between a few requests -- it should be changing.

@ErikEJ
Copy link
Contributor

ErikEJ commented Mar 6, 2021

Try the latest SqlClient 2.1.2 (assume you are on Linux)

@ajcvickers
Copy link
Member

EF Team Triage: Closing this issue as the requested additional details have not been provided and we have been unable to reproduce it.

BTW this is a canned response and may have info or details that do not directly apply to this particular issue. While we'd like to spend the time to uniquely address every incoming issue, we get a lot traffic on the EF projects and that is not practical. To ensure we maximize the time we have to work on fixing bugs, implementing new features, etc. we use canned responses for common triage decisions.

@bucky66
Copy link
Author

bucky66 commented Mar 17, 2021

We were able to create a sample project and reproduce but not before the issue was closed. Incidentally, a few days after we opened this issue, the fix was released in dotnet/SqlClient#950. With this fix we can no longer produce the incorrect data issue.

Also, thanks to ErikEJ for the great tip! Explicitly including SqlClient 2.1.2 also fixed the problem.

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants