Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QUIC: Peer closing connection can result in OperationCanceledException on pending write #58078

Closed
geoffkizer opened this issue Aug 24, 2021 · 5 comments · Fixed by #67651
Closed
Assignees
Milestone

Comments

@geoffkizer
Copy link
Contributor

I found this while looking into test issues.

Here's what I see:
(a) Server tries to send a GOAWAY on the client's control stream; this write pends
(b) Client closes the connection (this races with (a), but it happens pretty consistently)
(c) Server receives PEER_SEND_COMPLETE from msquic, and completes the pending write from (a) with OperationCanceledException: "Write was canceled"

I would expect this to result in QuicConnectionAbortedException instead.

Stack trace for (c) above:

  !!!!! ShutdownAsync: got exception System.OperationCanceledException: Write was canceled
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEventSendComplete(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 1139
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEvent(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 854
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.NativeCallbackHandler(IntPtr stream, IntPtr context, StreamEvent& streamEvent) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 830
  --- End of stack trace from previous location ---
     at System.Net.Quic.Implementations.MsQuic.Internal.ResettableCompletionSource`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\Internal\ResettableCompletionSource.cs:line 73
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.WriteAsync(ReadOnlyMemory`1 buffer, Boolean endStream, CancellationToken cancellationToken) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 296
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameHeaderAsync(Int64 frameType, Int32 payloadLength) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 162
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameAsync(Int64 frameType, ReadOnlyMemory`1 framePayload) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 167
     at System.Net.Test.Common.Http3LoopbackStream.SendGoAwayFrameAsync(Int64 firstInvalidStreamId) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 150
     at System.Net.Test.Common.Http3LoopbackConnection.ShutdownAsync() in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackConnection.cs:line 249
      System.Net.Http.Functional.Tests.SocketsHttpHandlerTest_HttpClientHandlerTest_Headers_Http3_MsQuic.GetAsync_EmptyResponseHeader_Success [FAIL]
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Aug 24, 2021
@ghost
Copy link

ghost commented Aug 24, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

I found this while looking into test issues.

Here's what I see:
(a) Server tries to send a GOAWAY on the client's control stream; this write pends
(b) Client closes the connection (this races with (a), but it happens pretty consistently)
(c) Server receives PEER_SEND_COMPLETE from msquic, and completes the pending write from (a) with OperationCanceledException: "Write was canceled"

I would expect this to result in QuicConnectionAbortedException instead.

Stack trace for (c) above:

  !!!!! ShutdownAsync: got exception System.OperationCanceledException: Write was canceled
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEventSendComplete(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 1139
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEvent(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 854
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.NativeCallbackHandler(IntPtr stream, IntPtr context, StreamEvent& streamEvent) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 830
  --- End of stack trace from previous location ---
     at System.Net.Quic.Implementations.MsQuic.Internal.ResettableCompletionSource`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\Internal\ResettableCompletionSource.cs:line 73
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.WriteAsync(ReadOnlyMemory`1 buffer, Boolean endStream, CancellationToken cancellationToken) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 296
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameHeaderAsync(Int64 frameType, Int32 payloadLength) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 162
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameAsync(Int64 frameType, ReadOnlyMemory`1 framePayload) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 167
     at System.Net.Test.Common.Http3LoopbackStream.SendGoAwayFrameAsync(Int64 firstInvalidStreamId) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 150
     at System.Net.Test.Common.Http3LoopbackConnection.ShutdownAsync() in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackConnection.cs:line 249
      System.Net.Http.Functional.Tests.SocketsHttpHandlerTest_HttpClientHandlerTest_Headers_Http3_MsQuic.GetAsync_EmptyResponseHeader_Success [FAIL]
Author: geoffkizer
Assignees: -
Labels:

area-System.Net, untriaged

Milestone: -

@geoffkizer geoffkizer added area-System.Net.Quic and removed area-System.Net untriaged New issue has not been triaged by the area owner labels Aug 24, 2021
@ghost
Copy link

ghost commented Aug 24, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

I found this while looking into test issues.

Here's what I see:
(a) Server tries to send a GOAWAY on the client's control stream; this write pends
(b) Client closes the connection (this races with (a), but it happens pretty consistently)
(c) Server receives PEER_SEND_COMPLETE from msquic, and completes the pending write from (a) with OperationCanceledException: "Write was canceled"

I would expect this to result in QuicConnectionAbortedException instead.

Stack trace for (c) above:

  !!!!! ShutdownAsync: got exception System.OperationCanceledException: Write was canceled
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEventSendComplete(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 1139
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.HandleEvent(State state, StreamEvent& evt) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 854
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.NativeCallbackHandler(IntPtr stream, IntPtr context, StreamEvent& streamEvent) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 830
  --- End of stack trace from previous location ---
     at System.Net.Quic.Implementations.MsQuic.Internal.ResettableCompletionSource`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\Internal\ResettableCompletionSource.cs:line 73
     at System.Net.Quic.Implementations.MsQuic.MsQuicStream.WriteAsync(ReadOnlyMemory`1 buffer, Boolean endStream, CancellationToken cancellationToken) in C:\github\geoffkizer\runtime\src\libraries\System.Net.Quic\src\System\Net\Quic\Implementations\MsQuic\MsQuicStream.cs:line 296
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameHeaderAsync(Int64 frameType, Int32 payloadLength) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 162
     at System.Net.Test.Common.Http3LoopbackStream.SendFrameAsync(Int64 frameType, ReadOnlyMemory`1 framePayload) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 167
     at System.Net.Test.Common.Http3LoopbackStream.SendGoAwayFrameAsync(Int64 firstInvalidStreamId) in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackStream.cs:line 150
     at System.Net.Test.Common.Http3LoopbackConnection.ShutdownAsync() in C:\github\geoffkizer\runtime\src\libraries\Common\tests\System\Net\Http\Http3LoopbackConnection.cs:line 249
      System.Net.Http.Functional.Tests.SocketsHttpHandlerTest_HttpClientHandlerTest_Headers_Http3_MsQuic.GetAsync_EmptyResponseHeader_Success [FAIL]
Author: geoffkizer
Assignees: -
Labels:

area-System.Net.Quic

Milestone: -

@ManickaP
Copy link
Member

AFAIU it's because the event itself contains canceled flag:

The choice of the exception here is up to us, so I assume we can change it without any problems.

Triage: we should revisit this in 7.0 frame, related to #55619

cc @CarnaViire

@ManickaP ManickaP added this to the 7.0.0 milestone Aug 25, 2021
@CarnaViire
Copy link
Member

CarnaViire commented Aug 25, 2021

I've seen exactly same thing happening while investigating #53090.

The cause of OperationCanceledException is indeed Canceled flag inside PEER_SEND_COMPLETE event. It is not as easy to mitigate though.

  1. We cannot just change OperationCanceledException to QuicConnectionAbortedException, because:

    • Canceled=true might be set due to another reason, e.g. peer aborting reading (without closing the connection)
    • Even if it was canceled due to connection being closed, we don't have Connection's error code at that moment, because related events didn't happen yet
  2. I though I could just ignore PEER_SEND_COMPLETE if Canceled=true and defer handling to the following aborting events (PEER_RECEIVE_ABORTED or SHUTDOWN_COMPLETE with ConnectionClosed=true) but I've started getting INVALID_STATE exception from StreamSendDelegate instead. I think it's not because I ignored the event, but it's just another type of possible outcome from the race between writing and closing connection, which just got more visible. I think so because when ignoring the event, I left SendState in Pending, and if the next Write would start executing for some reason, it would fail on Debug.Assert in HandleWriteStartState.

@geoffkizer
Copy link
Contributor Author

2. I've started getting INVALID_STATE exception from StreamSendDelegate instead. I think it's not because I ignored the event, but it's just another type of possible outcome from the race between writing and closing connection, which just got more visible.

Yeah, it seems likely that this race already existed, but the change you made resulted in it happening more often.

I assume we need to deal with INVALID_STATE from StreamSend somehow, as there is just an inherent race here. Is there some way we can determine the stream state from msquic after this happens?

@rzikm rzikm self-assigned this Apr 6, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 6, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
4 participants