Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI) #248

cheenamalhotra · 2019-10-08T20:16:34Z

Ports dotnet/corefx#38271 by @Wraith2 to fix #109

(This PR is a copy of the original PR, so we can review these changes here first.)

cheenamalhotra · 2019-10-08T21:03:48Z

@Wraith2 this PR and dotnet/corefx#38271 contain the same changes, but the newly added tests are failing on Linux in this PR. Could you take a look?

Manual Tests Log: log_15_16802.zip

On Windows, the range tests are also failing, with an actual value of 0.99. Logs: log_31_16801.zip

Wraith2 · 2019-10-08T23:07:55Z

The range failing sort of makes sense. There's clearly some difference between this repo and corefx to do with cancellation, or #234 wouldn't have been needed. I think that PR blocks this one from being effective: this fix is higher in the stack, so you can't reach it while the other problem is in the way. So I'd advise queuing this one after the other and retesting once it's merged.

For netfx, the test was never intended for that build. A value of 0.99 means it doesn't exhibit the problem that this PR is attempting to fix, though, which is probably good. I think it also means the timing is slightly wonky, since it's supposed to cancel after 1 second and less than a second has elapsed. We could probably just relax the lower bound, because it's the upper bound that signals the error condition.

Oh, and PlainCancelTestAsync isn't one that's changed in this PR. It looks from the log like it was coded to expect a specific piece of text in the error message rather than the exception type.
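To make the "relax the lower bound" idea concrete, here is a minimal sketch of such a range check. It is written in Python purely for illustration and is not the actual test code; the function name, the slack value and the upper limit are all assumptions.

```python
def cancel_time_in_range(elapsed_s, expected_s=1.0, lower_slack_s=0.1, upper_limit_s=5.0):
    """Return True when a cancellation took a plausible amount of time.

    All names and default values here are hypothetical. The lower bound is
    relaxed by lower_slack_s so that timer jitter (e.g. 0.99s for a 1s delay)
    passes; the upper bound is the one that signals a stuck cancellation.
    """
    return (expected_s - lower_slack_s) <= elapsed_s < upper_limit_s


if __name__ == "__main__":
    print(cancel_time_in_range(0.99))   # slightly early: accepted
    print(cancel_time_in_range(30.0))   # ran far too long: rejected
```

With a check like this, the 0.99 reading from the Windows log would pass, while a command that ignores cancellation and keeps running would still fail.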

GSPP · 2019-12-17T08:24:00Z

There's now a lock that is optionally entered. I'm curious how that can possibly be safe? If entering the lock fails then execution simply continues with someone else holding the lock, or with no one holding the lock even.

Wraith2 · 2019-12-17T08:53:13Z

It's safe because there are only two ways to acquire it, and one of them is the attention packet setup and send, which doesn't mutate local state. If you're attempting to send the attention packet and the lock is already held, you need to ignore the lock: even though something else is holding it, you know you're not going to mutate anything, and your very purpose is to cancel the thing that's holding the lock. If you're attempting to run and the lock is held, then you need to wait for it.
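The pattern described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the SqlClient implementation; `StateObject`, `execute`, `send_attention` and the `attention_sent` flag are hypothetical names chosen for the example.

```python
import threading


class StateObject:
    """Hypothetical stand-in for a connection state object (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.attention_sent = False
        self.log = []

    def execute(self, work):
        # The execute path always waits for the lock before running.
        with self._lock:
            self.log.append("execute")
            work()

    def send_attention(self):
        # The attention path only *tries* to take the lock. If an execute
        # already holds it, the attention proceeds anyway, on the assumption
        # that it does not mutate state the execute path depends on.
        took_lock = self._lock.acquire(blocking=False)
        try:
            self.attention_sent = True
            self.log.append("attention (lock taken)" if took_lock else "attention (lock skipped)")
        finally:
            if took_lock:
                self._lock.release()
        return took_lock


if __name__ == "__main__":
    state = StateObject()
    # Cancel while an execute holds the lock: the attention skips the lock.
    state.execute(state.send_attention)
    # Cancel while nothing is running: the attention takes the lock normally.
    state.send_attention()
    print(state.log)
```

The safety of this arrangement rests entirely on the attention path not touching shared state, which is the assumption being questioned in the rest of the discussion.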

GSPP · 2019-12-18T07:48:47Z

OK, so the requirement is that the lock is held, but it does not matter by whom. Could it not happen that the lock is held, so taking the lock is skipped, and then the lock is immediately released by the code that held it? That would allow execution without any lock held.

Wraith2 · 2019-12-18T09:14:36Z

Yes. The only order in which this can happen is that an execute releases the lock while an attention is still in flight, and then a second execute starts. That shouldn't cause a problem because the attention doesn't really mutate anything, though it will probably set the attention sent flag.

Do you have a suggestion for a better way to approach this? Using locks in code shared between sync and async executions is a bit dangerous in my opinion anyway. What would meet the requirements in this case is a counted mutex object that you can try to enter based on its current value: execute could only enter if it is <1, and attention could only enter if it's <2. I don't think the existing mutexes support this behaviour.
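A counted gate of the kind described here could look roughly like the sketch below. It is an illustration in Python under stated assumptions, not an existing .NET primitive or a proposed patch; `CountedGate`, `try_enter`, `exit` and the two limit constants are hypothetical names.

```python
import threading

# Hypothetical entry limits taken from the description above.
EXECUTE_LIMIT = 1    # execute may enter only while the count is < 1
ATTENTION_LIMIT = 2  # attention may enter only while the count is < 2


class CountedGate:
    """Non-blocking gate whose entry condition depends on the current count."""

    def __init__(self):
        self._guard = threading.Lock()  # held only for the brief counter update
        self._count = 0

    def try_enter(self, limit):
        """Enter if the current count is below limit; never blocks on a holder."""
        with self._guard:
            if self._count < limit:
                self._count += 1
                return True
            return False

    def exit(self):
        with self._guard:
            if self._count == 0:
                raise RuntimeError("exit() called without a matching try_enter()")
            self._count -= 1


if __name__ == "__main__":
    gate = CountedGate()
    print(gate.try_enter(EXECUTE_LIMIT))    # first execute enters
    print(gate.try_enter(ATTENTION_LIMIT))  # attention enters alongside it
    gate.exit()                             # execute finishes, attention still in flight
    print(gate.try_enter(EXECUTE_LIMIT))    # second execute is refused for now
```

One consequence worth noting: in this sketch a second execute cannot start while an attention is still in flight, which is exactly the interleaving raised in the previous question.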

GSPP · 2019-12-21T10:24:54Z

@Wraith2 I am totally unqualified to give advice on this 😄. I don't know how the codebase works.

I have learned over the years that holding a lock for an unbounded amount of time (such as during IO) is usually an impossible design. Doing this locks out other access to the system/data (such as cancellation). My experience is that this creates functional problems that cannot be fully fixed under such a design.

Often, a way out of this is to use a state machine approach. For example, track ongoing and queued operations in a data structure. Each access to those structures happens under a lock held for mere microseconds (no blocking, extremely rare contention). When an IO/operation completes, it must check those structures and possibly update them or initiate other work. I hope this does not sound too vague.
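As a rough illustration of the state machine idea, the sketch below tracks one active operation and a queue of pending ones, and takes a lock only for the bookkeeping, never across IO. It is written in Python as a generic example and says nothing about how ADO.NET or SqlClient is structured; `OperationTracker` and its methods are hypothetical.

```python
import threading
from collections import deque


class OperationTracker:
    """Tracks the active and queued operations; the lock guards bookkeeping only."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = None
        self._queued = deque()
        self._cancel_requested = False

    def submit(self, name):
        """Return the operation to start now, or None if it was queued."""
        with self._lock:
            if self._active is None:
                self._active = name
                return name
            self._queued.append(name)
            return None

    def request_cancel(self):
        """Record a cancel request; never waits for the running operation."""
        with self._lock:
            if self._active is None:
                return False
            self._cancel_requested = True
            return True

    def complete(self):
        """Called when IO finishes. Returns (was_cancelled, next_operation)."""
        with self._lock:
            was_cancelled = self._cancel_requested
            self._cancel_requested = False
            self._active = self._queued.popleft() if self._queued else None
            return was_cancelled, self._active


if __name__ == "__main__":
    tracker = OperationTracker()
    print(tracker.submit("query-1"))   # starts immediately
    print(tracker.submit("query-2"))   # queued behind query-1
    print(tracker.request_cancel())    # recorded without blocking
    print(tracker.complete())          # query-1 done, query-2 is next
```

The IO itself would happen outside these methods, so a cancel request is never locked out by a long-running operation.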

My understanding is that ADO.NET locks while doing IO. I have seen a number of problems on this issue tracker around cancellation, concurrency and locking. I got the impression that the concurrency design is not fundamentally correct. But this might be naive given that I am not familiar with the codebase.

Wraith2 · 2019-12-21T15:21:53Z

In general I agree, but I'm not in a position to totally rewrite the internals with lock-free mechanisms at the moment. I'll put that on the list of things to consider.

…t38271 # Conflicts: # src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SNI/SNIPacket.cs # src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserStateObjectManaged.cs

David-Engel

Since there are separate code paths for TCP vs named pipes connections, we should adjust the new tests to run for both scenarios.

src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/SqlCommand/SqlCommandCancelTest.cs
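One way to cover both code paths is to derive a TCP and a named pipes variant from the same server name and run each test against both. The sketch below is a Python illustration only, not the test code in this repo; it assumes the `tcp:` and `np:` protocol prefixes on the data source, and `protocol_variants` and `skip_named_pipes` are hypothetical names.

```python
def protocol_variants(data_source, skip_named_pipes=False):
    """Return (label, data_source) pairs, one per transport to exercise.

    skip_named_pipes is a hypothetical switch for environments where a
    named pipes connection is not available.
    """
    variants = [("tcp", "tcp:" + data_source)]
    if not skip_named_pipes:
        variants.append(("np", "np:" + data_source))
    return variants


if __name__ == "__main__":
    for label, source in protocol_variants("localhost"):
        print(label, source)
```

A test runner would then loop over these pairs (or feed them to a parameterized test) so that a regression in either transport shows up as a separate failure.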

olljanat · 2020-11-24T09:08:57Z

@cheenamalhotra AFAIU this issue also exists in the 1.1.x version, and therefore affects EF Core 3.1. Are there any plans to port this fix there too?

ErikEJ · 2020-11-24T10:37:51Z

@olljanat You can just update to use 2.1.0 with EF Core 3.x: https://erikej.github.io/efcore/sqlclient/2020/03/22/update_mds.html

cheenamalhotra added 2 commits October 8, 2019 13:13

Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI)

e161d9d

Fix default literal

4a81b64

cheenamalhotra changed the title ~~Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI)~~ WIP Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI) Oct 8, 2019

cheenamalhotra added the Backport to CoreFx label Oct 8, 2019

This was referenced Oct 17, 2019

Fix Cancel #234

Closed

SqlClient Fix managed command cancellation dotnet/corefx#38271

Closed

Wraith2 mentioned this pull request Dec 16, 2019

Canceling SQL Server query with while loop hangs forever #44

Open

cheenamalhotra mentioned this pull request Jan 7, 2020

The MARS TDS header contained errors using ASP.NET Core and EF Core connecting to Azure SQL Server #85

Closed

Minor test changes

23abcf5

cheenamalhotra changed the title ~~WIP Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI)~~ Port CoreFx PR 38271: Fix Statement Command Cancellation (Managed SNI) Jan 7, 2020

cheenamalhotra requested review from David-Engel, karinazhou and JRahnama January 7, 2020 17:15

Minor change to avoid random failures

922d3d4

David-Engel reviewed Jan 7, 2020

src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/SqlCommand/SqlCommandCancelTest.cs (outdated)

cheenamalhotra added 2 commits January 7, 2020 12:32

Run tests with both TCP and NP connection strings

39be357

Separate tests

be5565f

David-Engel reviewed Jan 7, 2020

src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/SqlCommand/SqlCommandCancelTest.cs (outdated)

Skip Named Pipes on Azure

58cdfc6

David-Engel approved these changes Jan 8, 2020

cheenamalhotra added this to the 2.0.0-preview1 milestone Jan 8, 2020

cheenamalhotra merged commit a700c9b into dotnet:master Jan 8, 2020
