Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High CPU usage after await (with sample repo) #3344

Closed
Tratcher opened this issue Jul 19, 2018 · 17 comments
Closed

High CPU usage after await (with sample repo) #3344

Tratcher opened this issue Jul 19, 2018 · 17 comments
Assignees

Comments

@Tratcher
Copy link
Member

From @daddyman on July 19, 2018 21:47

Since moving to .NET Core 2.1 we have been seeing our aspnet core web api spike to 99% CPU usage on Linux. This does not happen when running on Windows.

What we observe in our application is if we make 4 or 5 API calls in quick succession then the CPU goes to 99% and stays there for a few minutes.

I was able to strip down our app and reproduce the issue in a simple app. The repo is here: https://github.com/daddyman/dotnetcorecpu

I am running .NET Core 2.1.2 using SDK 2.1.302 on Ubuntu 16.04.

It seems to have something to do with "await". In the reproduction case I have one controller that handles a POST.

  • Performing an await Task.CompletedTask; does not cause a problem.
  • Performing an away DummyWait(); where DummyWait is return Task.Delay(1); causes high CPU after hitting the server with many requests in a short time. The requests complete quickly but then the CPU goes to 99% and stays there.

In our real app I did the same change of using the await Task.CompletedTask; and 'await Task.DummyWait();` and saw the same difference with the CPU usage.

Copied from original issue: dotnet/corefx#31208

@Tratcher
Copy link
Member Author

From @davidsh on July 19, 2018 21:53

cc: @Tratcher

@Tratcher
Copy link
Member Author

From @joshfree on July 19, 2018 21:57

@brianrob could you PTAL and help route?

@daddyman
Copy link

I updated the test app. Using a Task.Delay(100); makes the CPU spike happen quicker. I am guessing that causes the await to actually need to "wait" more often than the Task.Delay(1);.

In our app this is where we access the database using an await call which is going to need "wait" almost every time. Changing that code to do await Task.CompletedTask; never causes the CPU spike.

@beevik
Copy link

beevik commented Jul 20, 2018

I have a (more complicated) application that is seeing similar 100% CPU spikes that occur on awaited calls. I never see the problem when running directly on Windows. I see it only when running in a docker container (which is built on top of the microsoft/dotnet:2.1-aspnetcore-runtime container).

I see the problem with SDK 2.1.301 and runtime 2.1.1.

When I downgrade to SDK 2.1.4 and runtime 2.0.5, the problem goes away.

@Eilon
Copy link
Member

Eilon commented Jul 20, 2018

@davidfowl - does ASP.NET Core do anything at all that might affect this?

@brianrob
Copy link
Member

FYI, I'm also looking at this, but haven't been able to repro it thus far.

@Eilon
Copy link
Member

Eilon commented Jul 20, 2018

Thanks @brianrob , please do post anything you find out, even if it's nothing 😄

@beevik
Copy link

beevik commented Jul 20, 2018

A follow-up in response to @daddyman's comment:

In our app this is where we access the database using an await call which is going to need "wait" almost every time.

This is the case with me too. The await that triggers the spike in my code is on a call to the StackExchange.Redis SortedSetIncrementAsync (ZINCRBY) call. If I replace that with a synchronous call, the spike inevitably starts on a different await call.

Another piece of information: Once the 100% CPU spike hits, the await call that triggered it blocks for exactly 2 minutes so long as my application performs no additional awaited calls (via other controllers). Once the 2 minutes elapse, the CPU drops back to near-zero and the awaited call returns. The 2 minute-long spike is consistent; it's never more or less than 2 minutes unless I issue another awaited call from a different controller.

edit: I should add that the blocking await call that triggers the CPU spike has in fact successfully updated the database -- almost immediately -- with the transaction it was attempting to complete. The awaited call should have returned right away;; it's as if the CPU spinning is tied to the call's inability to return. At first I thought this might be a problem with the StackExchange.Redis library. But since @daddyman is also seeing something similar without using that library, it makes me think the problem might be .NET core 2.1.

@daddyman
Copy link

@brianrob When using the test app I posted and the await is set to use Task.Delay(100); I see it happen when I have Insomnia up and spam the server by pressing Send very quickly multiple times. Sometimes I see if happen after just 3 or 4 sends and sometimes I have a do it over and over for 20 or 30 seconds.

I tried to use curl to reproduce but it never triggered the 99% CPU.

@brianrob
Copy link
Member

Thanks @daddyman, I can now reproduce this. I've just collected a trace and am looking at it now.

@brianrob
Copy link
Member

I've had a chance to repro this and try to understand the behavior a bit. What I see are after "enough" POST requests, Pairs of threads will saturate CPUs. Most commonly, I saw just a single pair (2) threads saturate two CPUs, but I did once encounter two pairs (4) threads saturate four CPUs.

Looking at the trace, the functions where the bulk of the CPU time are spent are:
image

Drilling into Pipe::ReadAsync shows that there is contention on this stack as well as a bunch of CPU time:
image

If I look at the time spent in managed code Monitors, it is coming from two methods (ReadAsync and AdvanceReader):
image

Note that this only seems to occur when I attempt to create contention by sending multiple requests at the same time, but the spike lasts for around a minute on my machine. The time spent in the spike seems to be attempting to process one or more requests and is slowly chipping away at them (with some contention).

It looks like this investigation should be continued by folks who have some familiarity with Pipelines.

@Tratcher
Copy link
Member Author

Tratcher commented Jul 20, 2018

@pakrym
@halter73

@Tratcher
Copy link
Member Author

Related? aspnet/KestrelHttpServer#2694

@ZOXEXIVO
Copy link

@Tratcher, yes, we have similar problems

@beevik
Copy link

beevik commented Jul 21, 2018

@Tratcher Doing UseLibuv() in our web host builder chain also fixed my problem.

@daddyman
Copy link

As @beevik mentioned switching to using libuv as the transport fixed the problem for us.

I got information on switching back to libuv from here: https://blogs.msdn.microsoft.com/webdev/2018/04/12/asp-net-core-2-1-0-preview2-improvements-to-the-kestrel-http-server/

@pakrym
Copy link
Contributor

pakrym commented Jul 23, 2018

I'm closing this in favour of aspnet/KestrelHttpServer#2694

@pakrym pakrym closed this as completed Jul 23, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Dec 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants