Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in HttpClient timeout handling and GetAsync_ContentCanBeCanceled #44169

Merged
merged 1 commit into from
Nov 3, 2020

Conversation

stephentoub
Copy link
Member

Fixes #44003

There are three issues here.

Two tests issues:

  1. When the client times out, it could do so while the server is still reading the request or sending the response, in which case the server can fail and cause the test to fail. The server failure needs to be ignored.
  2. when the client times out, it could do so before it even initiates the request, in which case the test would hang while the server waits indefinitely for a request that'll never arrive. The server waiting needs to factor in the client's completion.

And one product issue:

  1. The timeout handling we added in .NET 5 has a race condition that can, in extreme conditions (e.g. a unit test trying to force the interleaving) result in us not throwing the appropriate TimeoutException. There are two different timer mechanisms used, the one based on Timer inside of CancellationTokenSource, and then the use of Envrionment.TickCount64 to track how much time has progressed. Their quantums are different, so it's possible for the timer to fire due to the timeout expiring but Environment.TickCount64 not ticking over, at which point we'll read it and determine that the cause of the cancellation wasn't for a timeout. The fix is to use the pending requests token source to check whether it or the timeout occurred, rather than using the time elapsed to determine which of the two occurred.

cc: @geoffkizer, @wfurt, @alnikola

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 2, 2020
@stephentoub stephentoub changed the title Fix race condition in timeout handling and GetAsync_ContentCanBeCanceled Fix race condition in HttpClient timeout handling and GetAsync_ContentCanBeCanceled Nov 2, 2020
There are three issues here.

Two tests issues:
1. When the client times out, it could do so while the server is still reading the request or sending the response, in which case the server can fail and cause the test to fail.  The server failure needs to be ignored.
2. when the client times out, it could do so before it even initiates the request, in which case the test would hang while the server waits indefinitely for a request that'll never arrive.  The server waiting needs to factor in the client's completion.

And one product issue:
1. The timeout handling we added in .NET 5 has a race condition that can, in extreme conditions (e.g. a unit test trying to force the interleaving) result in us not throwing the appropriate TimeoutException.  There are two different timer mechanisms used, the one based on Timer inside of CancellationTokenSource, and then the use of Envrionment.TickCount64 to track how much time has progressed.  Their quantums are different, so it's possible for the timer to fire due to the timeout expiring but Environment.TickCount64 not ticking over, at which point we'll read it and determine that the cause of the cancellation wasn't for a timeout.  The fix is to use the pending requests token source to check whether it or the timeout occurred, rather than using the time elapsed to determine which of the two occurred.
@wfurt
Copy link
Member

wfurt commented Nov 3, 2020

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@wfurt wfurt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@stephentoub stephentoub merged commit 66a623e into dotnet:master Nov 3, 2020
@stephentoub stephentoub deleted the fixhctimeout branch November 3, 2020 10:55
@ghost ghost locked as resolved and limited conversation to collaborators Dec 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GetAsync_ContentCanBeCanceled test is failing in CI
3 participants