Broken connection on unix systems when executing multiple concurrent statements against a single connection #1620
Comments
You seem to be trying to use the same SqlConnection instance concurrently from multiple threads - that is not supported. Multiple active result sets (MARS) does not change this - connections are not thread-safe (see docs). |
@roji understand, but why are we getting an SNI connectivity error? I'm curious as to what that has to do with MARS. |
@Mike737377 once you do concurrent access to an API that isn't thread-safe (SqlConnection in this case), usually all bets are off and the behavior is completely undefined; in this case, you're effectively having multiple threads doing I/O on the same underlying socket, which causes everything to break down. MARS is about allowing multiple open readers, and does not provide thread-safety. That means that you must await e.g. ExecuteReaderAsync, but once that's completed and you have the returned reader, you can execute another query without first consuming the first reader. |
Does it just magically work on Windows then? I would've at least thought the behaviour should be similar in this regard. |
Unless you're forcing the windows code to use the managed network implementation then you're seeing a difference between the managed and unmanaged implementations. So yes, it does sort of magically work on windows. As Roji said multithreaded use of connections is not safe. You've provided a replication though so I'm sure someone will take a look and see if there's a way we can prevent this problem. |
I've been looking into how the connection is checked and I've noticed that there seem to be 2 functions; in SqlClient/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs Line 2606 in dfa62a1
I also spotted that the logic in SqlClient/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs Line 2618 in dfa62a1
Out of curiosity adding a call to
However, if I add some quick retry logic then I never seem to lose the connection, which means the error is transient.
internal override uint CheckConnection()
{
    SNIHandle handle = Handle;
    uint result = TdsEnums.SNI_SUCCESS;
    if (handle != null)
    {
        // Retry the check a bounded number of times; MaxConnectionCheckAttempts is defined elsewhere (not shown).
        for (var attempt = 1; attempt <= MaxConnectionCheckAttempts; attempt++)
        {
            result = handle.CheckConnection();
            if (result == TdsEnums.SNI_SUCCESS)
            {
                return result;
            }
            SqlClientEventSource.Log.TryTraceEvent("TdsParserStateObjectManaged.CheckConnection | Info | Connection check failed result {0}, Attempts: {1}/{2}", result, attempt, MaxConnectionCheckAttempts);
        }
    }
    // Either there is no handle, or every attempt failed; return the last result.
    return result;
} |
I highly suspect that given the right timing and usage patterns, things would fail on Windows as well. That's the thing with concurrency bugs - they're timing sensitive, and exactly when they cause failures (and which type) is undefined and random. Whatever you happen to be seeing on Windows with your code, concurrent usage of SqlConnection is not supported. |
@roji and @Wraith2 thanks for providing accurate responses. @Mike737377 I also agree on this matter with @roji and @Wraith2. MARS does not change anything in this scenario. As Roji has mentioned:
|
@roji @Wraith2 @JRahnama thanks for your comments so far, and I understand that you're saying it's not officially supported. However, reading between the lines, does this mean that this is a "won't fix" issue? Do you want me to raise the impossible logic statement that I raised in my previous comment as a separate issue?
|
As a community contributor I have more freedom to speak bluntly than others. This is a complex issue which is going to be a nightmare to debug, and it isn't an issue that I would give much priority because it's skirting around unsupported behaviour, so even if I fix it there's a decent chance that you'll immediately hit another issue with a similar cause. I wouldn't say "won't fix"; it's just not a good value proposition to spend time on. I will say that I've spent time on more obscure issues in this repo in the past.
|
You're right. The double negation got me on this one. Thanks for your comments @Wraith2. I've been rewriting the code to be non-concurrent and with MARS disabled, so hopefully I can put the issue to bed in our code base. |
Note that you don't have to disable MARS, which allows multiple resultsets (readers) to be open at the same time. That's unrelated to the above discussion, which is about using SqlClient APIs concurrently from multiple threads. Having said that, MARS has its own issues, and it's not a bad idea to move away from it - but that's orthogonal to the discussion above. |
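For readers landing here with the same problem: one way to make access to a shared connection non-concurrent, as discussed above, is to funnel every operation through an async gate. The following is only a minimal sketch under stated assumptions. `SerializedResource<T>`, `UseAsync` and `Counter` are hypothetical names made up for illustration and are not part of SqlClient; the counter merely stands in for an open SqlConnection so the sample runs without a database.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stand-in resource; in real code this would be the single open SqlConnection.
var shared = new SerializedResource<Counter>(new Counter());

// 100 callers start at once, but the gate lets only one touch the resource at a time.
var tasks = Enumerable.Range(0, 100).Select(_ => shared.UseAsync(async c =>
{
    int before = c.Value;
    await Task.Yield();   // simulated I/O; without the gate, updates could be lost here
    c.Value = before + 1;
    return c.Value;
}));
await Task.WhenAll(tasks);

int total = await shared.UseAsync(c => Task.FromResult(c.Value));
Console.WriteLine($"total = {total}");

// Hypothetical stand-in type, not thread-safe on its own.
public sealed class Counter
{
    public int Value;
}

// Hypothetical helper: serializes all use of a resource that is not thread-safe.
public sealed class SerializedResource<T>
{
    private readonly T _resource;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public SerializedResource(T resource) => _resource = resource;

    public async Task<TResult> UseAsync<TResult>(Func<T, Task<TResult>> work)
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Only one caller at a time reaches this point.
            return await work(_resource).ConfigureAwait(false);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

With a real connection, the delegate would need to finish all work on the command and its reader before returning, otherwise the next caller could still overlap with it. Opening a separate pooled connection per unit of work is the other common option, though that does not fit a single long-running transaction.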
Closing this as it is not an issue in the library. |
Describe the bug
On unix based systems the following exception is thrown when a single connection is being shared by multiple tasks/threads. The number of threads does not seem to be too relevant as long as there are 2+ threads involved.
Other threads throw:
System.InvalidOperationException: Invalid operation. The connection is closed.
System.InvalidOperationException: BeginExecuteReader requires an open and available Connection. The connection's current state is open.
System.InvalidOperationException: BeginExecuteReader requires an open and available Connection. The connection's current state is closed.
It will typically take somewhere between 10,000 and 10,000,000 SQL statements over the connection before the exception happens. By increasing the minimum number of threads in the thread pool into the hundreds via ThreadPool.SetMinThreads, the exception will usually present itself earlier on.
This exception is problematic for our use case as this is a long-running transactional batch import operation; otherwise we would just reconnect and continue.
Running the same version of the code on Windows executes successfully.
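The thread-pool adjustment mentioned above can be sketched as follows. This is an illustration only: the value 200 is an assumption chosen to match "into the hundreds" and is not taken from the original repro.

```csharp
using System;
using System.Threading;

// Read the current minimums so the I/O completion-port value is left unchanged.
ThreadPool.GetMinThreads(out int minWorker, out int minIo);

// Assumed value for illustration; the report only says "into the hundreds".
const int desiredWorkers = 200;

// SetMinThreads returns false if the requested values are rejected
// (for example, when they exceed the pool's maximum).
bool applied = ThreadPool.SetMinThreads(desiredWorkers, minIo);

ThreadPool.GetMinThreads(out int newWorker, out _);
Console.WriteLine($"applied={applied}, min worker threads: {minWorker} -> {newWorker}");
```

Raising the minimum makes the pool start threads sooner under load, which increases the overlap between tasks and so makes timing-sensitive failures surface earlier.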
To reproduce
The following is a harsh simulation of what we are doing however I'm unable to reproduce the exact error as I encounter previously reported errors (#422, #826) before I can reproduce this one.
Expected behavior
SqlConnection stays open
Further technical details
.NET target: .NET 5 and .NET 6
SQL Server version: SQL Server 14.0.2037.2, Azure SQL instance (general purpose serverless gen 5)
Operating system: 5.4.0-110-generic #124-Ubuntu, Ubuntu Focal 20.04 inside a docker container
Reproducible against:
Additional context
I've grabbed a copy of the code (commit dfa62a1746) and added some extra tracing, and can see that SNIHandle.CheckConnection() in the following stack trace is returning 1.
Using dotnet trace to collect the event source I can see the following
Note that ValidateSNIConnection is one of the tracing points I added to determine that CheckConnection was returning something other than success.
I'm happy to try anything that will either help pinpoint the issue or attempt a workaround.