
Improve io thread scheduler #4862

Merged
3 commits merged on Jul 28, 2022

Conversation
Conversation

mconnew
Member

@mconnew mconnew commented Jul 22, 2022

SynchronizationContext.Current doesn't flow through an asynchronously completed await. The existing code only worked for scheduling code to an IO thread up until the second await which completed asynchronously. The first await correctly schedules code to run using the sync context, but then Current would be null. This meant any subsequent awaits would use the default synchronization context, which means the continuation happened on the thread pool. This change creates a TaskScheduler based on the synchronization context and executes the code through that. This way the synchronization context is ignored (so a null value doesn't matter) and continuations correctly execute using the IOThreadScheduler.
There was also a bug where we could block the IOThreadScheduler dispatch thread, because we were dispatching code to the IOThreadScheduler which effectively called Task.Wait(). This only wasn't a problem because of the previously mentioned bug where continuations were being scheduled to the thread pool, which meant the Task.Wait() would complete. This should improve performance, as we no longer block the IOThreadScheduler with a Wait() call, so any other code queued up on the scheduler will progress more quickly.
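For illustration, here is a minimal sketch of that technique (the class and method names here are hypothetical, not the actual WCF implementation): the synchronization context is wrapped in a TaskScheduler, so await continuations come back through TaskScheduler.Current instead of relying on SynchronizationContext.Current, which doesn't flow across asynchronously completed awaits.

using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only; not the actual WCF IOThreadScheduler code.
static class IoThreadDispatch
{
    public static Task RunOnIoScheduler(SynchronizationContext ioSyncContext, Func<Task> work)
    {
        TaskScheduler scheduler;
        var previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(ioSyncContext);
        try
        {
            // Builds a TaskScheduler that posts queued tasks to ioSyncContext.
            scheduler = TaskScheduler.FromCurrentSynchronizationContext();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }

        // Because the async work starts on that scheduler, TaskScheduler.Current
        // is this scheduler inside 'work', so continuations after each await run
        // there even though SynchronizationContext.Current is null.
        return Task.Factory.StartNew(work, CancellationToken.None,
            TaskCreationOptions.None, scheduler).Unwrap();
    }
}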

@mconnew
Member Author

mconnew commented Jul 28, 2022

This has been verified by vendors to not cause any regressions. An internal dependent team has validated this improves the responsiveness of WCF when opening synchronously while experiencing thread pool contention. Merging because of this extra validation.

@mconnew mconnew merged commit 1f11516 into dotnet:main Jul 28, 2022
@scombroid

Hi @mconnew ,

Our team recently upgraded a .NET Core 3.1 API to .NET 6; under the hood, the API makes multiple WCF calls to another system.
Since the upgrade to .NET 6, we have noticed significantly slower responses when the API is under load.
We have been chasing a performance issue in our test environment, and in the end we narrowed it down to WCF.
Using dotnet-counters, we were able to see that the thread pool struggled badly when the API was under load; we were doing 100 requests per second.
Interestingly, when we kept the API traffic steady at 100 requests per second, the thread pool eventually settled down (after roughly 2 minutes of excessively slow responses).
Comparing the exact same load test on .NET Core 3.1 (in dotnet-counters), we did notice a very brief moment of thread pool struggle, but it recovered very quickly (unlike in .NET 6).

We also tried upgrading to the preview version "4.10.0-preview1.22261.2", but still ran into the same issue.

Today we saw this PR of yours from 3 days ago! We took the main branch, built the System.ServiceModel.xx DLLs, and the issue is now fixed!
We are writing this message to thank you for the fix! Especially with .NET Core 3.1 LTS support ending this December, this issue had caused a lot of anxiety in the team.

A couple of questions:

  1. Is there a reason why the issue is exacerbated in .NET 6?
  2. We would like to know when this fix will be included in a release.

@mconnew
Member Author

mconnew commented Aug 1, 2022

I don't know why this issue is worse in .NET 6. I believe the team who asked for this was experiencing the problem on .NET Core 3.1. There is another change I didn't explicitly call out, which was around staying async when not calling external code. The problem we have is that we need to not break custom binding elements and need to maintain expected behavior. If you have a custom channel layer and call Open() on the channel, there is an expectation that Open() is called on the custom channel layer and not BeginOpen.

With the original code, when you called Open(), it would call the internal OpenAsync() and continue on another thread. When that code needed to open the next channel layer, it would call Open() again, as the original call was synchronous. This meant that when you had several layers all using inbox channel layers, you would consume a thread calling Open() which then blocked waiting for OpenAsync() to complete, one time for each layer.

The change I made was to detect that the next layer is an inbox implementation and pass on the info that Open() was originally called synchronously, but then call OpenAsync(), as we know it can support that (and it won't be a surprise hitting an unimplemented code path). This means you only block the single original thread which calls Open() synchronously.
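As a purely illustrative sketch (these types and the IsInboxImplementation check are made up, not the actual WCF channel stack), the idea is that only the outermost synchronous Open() blocks a thread, while inner layers known to support async are driven through their async path:

using System.Threading.Tasks;

// Illustrative only: a simplified stand-in for the layered-open pattern described above.
public interface IChannelLayer
{
    void Open();                         // synchronous open, expected by custom layers
    Task OpenAsync();                    // async open, supported by inbox layers
    bool IsInboxImplementation { get; }
}

public class LayeredChannel : IChannelLayer
{
    private readonly IChannelLayer _inner;

    public LayeredChannel(IChannelLayer inner) => _inner = inner;

    public bool IsInboxImplementation => true;

    // The outermost synchronous Open() is the only call that blocks a thread.
    public void Open() => OpenAsync().GetAwaiter().GetResult();

    public async Task OpenAsync()
    {
        if (_inner.IsInboxImplementation)
        {
            // Inbox layer: stay async all the way down; no extra blocked thread.
            await _inner.OpenAsync();
        }
        else
        {
            // Custom/external layer: preserve the expected synchronous Open() call.
            _inner.Open();
        }
    }
}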

I don't know what changed in .NET 6 to make this problem worse. Either it needs more concurrent threads to do the same amount of work, or the thread pool doesn't grow as readily as in earlier versions.

This is only an issue when you call WCF synchronously. I have a couple of suggestions. If you share a channel between multiple threads (i.e. you call CreateChannel() and have multiple threads use that channel at the same time), always open the channel before first use (cast it to ICommunicationObject) and open it asynchronously. You can do that like this:

var channel = factory.CreateChannel();
var commObj = (ICommunicationObject)channel;
await Task.Factory.FromAsync(commObj.BeginOpen, commObj.EndOpen, null);

Otherwise there's a behavior that ensures in-order calling of requests (requests go out on the wire in the same order that the calls were made), and it can cause an issue. Basically, the first request detects that the channel needs to be opened, places the actual request into a queue, and calls open. Once the channel is opened, it starts draining the queue, making one request at a time. When a new request is made, if that queue isn't empty, it adds itself onto the end of the queue. This means that if you initially have a lot of requests happening concurrently using a single channel, you can end up making the requests one at a time until the queue is empty. This is an easy situation to get into if you had some thread contention when you made the first call and you share a channel instance. So basically, always explicitly open your channel if there's any sharing between threads.

The next suggestion is to ensure your contract only uses async operations. The client and server don't need to match, so if your service implementation is sync, your client can still be async, as long as the name of the operation is compatible. This means the method Foo Bar(int param) is compatible with Task<Foo> BarAsync(int param). Sync vs async is about the local call pattern and doesn't change anything on the wire.
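As a sketch of what that pairing can look like (the ICalculatorService and Add names are made up for illustration, with the contract and operation names aligned explicitly so both shapes map to the same wire operation):

using System.ServiceModel;
using System.Threading.Tasks;

// Server-side contract: synchronous operation.
[ServiceContract]
public interface ICalculatorService
{
    [OperationContract]
    int Add(int x, int y);
}

// Client-side contract: async operation with the same contract and operation
// names, so it is compatible with the synchronous service implementation above.
[ServiceContract(Name = "ICalculatorService")]
public interface ICalculatorServiceClient
{
    [OperationContract(Name = "Add")]
    Task<int> AddAsync(int x, int y);
}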

If you are using a generated client, enable channel factory caching. If your client type is called CalculatorServiceClient, then before any instantiation of the type, call CalculatorServiceClient.CacheSetting = CacheSetting.AlwaysOn;. This shares the ChannelFactory instance between client instances.
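For example (reusing the hypothetical CalculatorServiceClient name from above):

// Set before any CalculatorServiceClient instance is created; after that,
// all instances share one cached ChannelFactory.
CalculatorServiceClient.CacheSetting = CacheSetting.AlwaysOn;

var client = new CalculatorServiceClient();
int result = await client.AddAsync(1, 2);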

The final hammer you can throw at the problem is to increase MinThreads on the thread pool so that it ramps up more quickly. Work out how many worker threads you typically use at steady state, then have this code at the start of your program:

// typicalWorkerThreadsMeasured is the worker thread count you measured at steady state.
int workerThreads, completionPortThreads;
ThreadPool.GetMinThreads(out workerThreads, out completionPortThreads);
workerThreads = typicalWorkerThreadsMeasured;
ThreadPool.SetMinThreads(workerThreads, completionPortThreads);

You can find out the number of thread pool threads you currently have available with this code:

int workerThreads, completionPortThreads;
// Returns how many more thread pool threads could still be started (max minus currently active).
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);

Or you can take a dump and, in WinDbg, use the SOS command !threadpool. Raising MinThreads removes the warmup period and allows the thread pool to immediately ramp up to the configured number of threads if needed. You might also need to increase the completion port threads, as async IO completes on completion port threads, but the issue this PR fixed isn't affected by the completion port thread limit; that would be a general scale-up issue related to doing lots of IO.

Edit: Just realized I didn't answer your final question. This fix will be in .NET 7. We are almost certain to release another preview/rc release with these changes well before then.

@scombroid

Thanks heaps for the detailed reply and advice.

We rolled back our WCF DLLs to the current release, v4.9.0.
We added an async open; the async open alone did not solve the issue, so we also changed the close to an async close. It turned out that having both async open and close solved our performance issue entirely!

The changes we made:

  1. We added:
var commObj = (ICommunicationObject)clientChannel;
await Task.Factory.FromAsync(commObj.BeginOpen, commObj.EndOpen, null);
  2. We also changed clientChannel.Close(); to await Task.Factory.FromAsync(commObj.BeginClose, commObj.EndClose, null); (a combined sketch follows below).
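Putting those two changes together, the pattern looks roughly like this (the factory and channel variable names are illustrative):

var clientChannel = factory.CreateChannel();
var commObj = (ICommunicationObject)clientChannel;

// Explicit async open before the first call.
await Task.Factory.FromAsync(commObj.BeginOpen, commObj.EndOpen, null);
try
{
    // ... make the WCF calls on clientChannel here ...
}
finally
{
    // Explicit async close instead of clientChannel.Close().
    await Task.Factory.FromAsync(commObj.BeginClose, commObj.EndClose, null);
}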

P.S.
What a crazy and strange coincidence that our team ran into this issue and had been struggling with it for most of last week. We had been reading and inspecting dotnet and WCF release notes and PRs... basically anything related to the thread pool and WCF...
And then your fix came in at the end of the week.
Our team came into work on Monday and saw a closed PR (merged to main) with a title that rang all the bells... quickly pull / build / deploy / test... and bingo, that was the fix we were looking for!!

Our team from Brisbane, Australia is super thankful for the fix & your advice.
