Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in connections creation performance #61233

Closed
Tracked by #39846
sebastienros opened this issue Nov 4, 2021 · 10 comments · Fixed by #64770
Closed
Tracked by #39846

Regression in connections creation performance #61233

sebastienros opened this issue Nov 4, 2021 · 10 comments · Fixed by #64770
Assignees
Milestone

Comments

@sebastienros
Copy link
Member

sebastienros commented Nov 4, 2021

I detected a regression due to this PR #53340 on "connection close" scenarios where we measure the speed of creating connections. It was discovered while investigating ssl handshakes performance comparisons with HttpSys. We discovered the difference was not due to the handshake itself.

On a 12 v-cores windows machines (aspnet-perf-win) it reduces RPS from 14K to 5K and latency goes from 2ms to 6ms. The load generator sends a Connection: close header to force new connections on every request. Can also be measured on Linux.

I can gather traces for the runs before and after the change. Let me know if you want to see ThreadTime/NetworkTCPIP in these since it impacts the traces size and perf differences. These benchmarks are not CPU bound and show a lot of BLOCKED time. The NetworkTCPIP stack might be unnecessary at this level.

@karelz @halter73 @geoffkizer

@sebastienros sebastienros added the tenet-performance Performance related issue label Nov 4, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Nov 4, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost
Copy link

ghost commented Nov 5, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

I detected a regression due to this PR #53340 on "connection close" scenarios where we measure the speed of creating connections. It was discovered while investigating ssl handshakes performance comparisons with HttpSys. We discovered the difference was not due to the handshake itself.

On a 12 v-cores windows machines (aspnet-perf-win) it reduces RPS from 14K to 5K and latency goes from 2ms to 6ms. The load generator sends a Connection: close header to force new connections on every request. Can also be measured on Linux.

I can gather traces for the runs before and after the change. Let me know if you want to see ThreadTime/NetworkTCPIP in these since it impacts the traces size and perf differences. These benchmarks are not CPU bound and show a lot of BLOCKED time. The NetworkTCPIP stack might be unnecessary at this level.

@karelz @halter73 @geoffkizer

Author: sebastienros
Assignees: -
Labels:

area-System.Net.Sockets, tenet-performance, untriaged

Milestone: -

@geoffkizer
Copy link
Contributor

Wow, that's a big regression. The linked PR did shuffle around some of the internal handling here, so perhaps that introduced an issue.

Since we are seeing a lot of BLOCKED time, it sounds like we have some sort of contention issue. Can you tell from the traces where the contention is occurring? I looked at the code, but don't see anything obvious here; there are a bunch of interlocked ops but no explicit blocking that I see.

@karelz karelz added this to the 7.0.0 milestone Nov 5, 2021
@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Nov 5, 2021
@karelz
Copy link
Member

karelz commented Nov 5, 2021

Given that this is non-trivial regression, we should fix it sooner in 7.0 and then consider backporting it to 6.0 based on fix risk and assessed + observed impact on customers.

@geoffkizer
Copy link
Contributor

I did notice one issue that could cause increased memory usage. Fix is here: #61258

@sebastienros Can you try that out and see if it helps?

This seems like a small issue so I doubt it would be the root cause here, but it's worth a shot.

@sebastienros
Copy link
Member Author

Crank command lines to repro and get traces:

Before (build 6.0.0-preview.6.21279.1+ff72cc9)

crank --config https://github.com/aspnet/Benchmarks/blob/main/src/BenchmarksApps/HelloWorldMiddleware/benchmarks.yml?raw=true --scenario connectionclose --profile aspnet-perf-win --application.framework net6.0 --chart --variable warmup=3 --variable duration=8 --variable connections=32  --application.aspnetcoreversion 6.0.0-preview.6.21279.2 --application.runtimeversion 6.0.0-preview.6.21279.1 --application.sdkVersion 6.0.100-preview.6.21276.12 --application.options.collectCounters true --application.collect true

After (build 6.0.0-preview.6.21279.3+b5c91e4)

crank --config https://github.com/aspnet/Benchmarks/blob/main/src/BenchmarksApps/HelloWorldMiddleware/benchmarks.yml?raw=true --scenario connectionclose --profile aspnet-perf-win --application.framework net6.0 --chart --variable warmup=3 --variable duration=8 --variable connections=32  --application.aspnetcoreversion 6.0.0-preview.6.21279.2 --application.runtimeversion 6.0.0-preview.6.21279.3 --application.sdkVersion 6.0.100-preview.6.21276.12 --application.options.collectCounters true --application.collect true

To remove ThreadTime from traces as it makes them much bigger, use this argument and remove +ThreadTime

--application.collectArguments "BufferSizeMB=1024;CircularMB=4096;TplEvents=None;Providers=Microsoft-Diagnostics-DiagnosticSource:0:0;KernelEvents=default+ThreadTime-NetworkTCPIP"

To test local changes add this argument:

--application.options.outputFiles "c:\build\socketsdll-release\*.*"

Note that collecting a trace impacts the RPS significantly, so to compare local changes impact I recommend to disable it.

@geoffkizer
Copy link
Contributor

I did a run with #61258 and it does not seem to be the root cause here. Further investigation is needed.

@geoffkizer
Copy link
Contributor

I did some runs on Linux (aspnet-perf-lin) and I see bimodal behavior there:

Before   After  
req/s CPU req/s CPU
12844 96 17217 92
6715 51 6659 49
14583 93 20078 91
15310 93 6616 50
6601 50 15051 93

This is weird and should be investigated further, but it's not a regression.

Windows results seem to be consistent, with Before around 17k and After around 6-7k. However, these results are similar enough to the two modes above that I wonder if something inadvertantly triggered the Windows behavior to switch from being consistently in the good mode, to consistently in the bad mode.

More investigation is needed here.

@karelz
Copy link
Member

karelz commented Nov 9, 2021

Triage: The PR shouldn't have caused this type of regression.
Linux was bimodal even before the PR, Windows after the PR is always "bad", This is weird, might be test issue, ASP.NET issue or Runtime issue -- we need to learn more.
Warning: Depends on HW.

Next step: Create micro-benchmark here (e.g. remove ASP.NET from the repro).

@CarnaViire
Copy link
Member

I am able to reproduce that on pure socket code (on crank); the code is here aspnet/Benchmarks@main...CarnaViire:socket-benchmark

6.0.0-preview.6.21279.1: Mean RPS = 15,479

crank --config https://raw.githubusercontent.com/CarnaViire/Benchmarks/socket-benchmark/src/BenchmarksApps/Sockets/sockets.benchmarks.yml --scenario http --profile aspnet-perf-win --server.aspnetcoreversion 6.0.0-preview.6.21279.2 --server.runtimeversion 6.0.0-preview.6.21279.1 --server.sdkVersion 6.0.100-preview.6.21276.12

6.0.0-preview.6.21279.3: Mean RPS = 8,400

crank --config https://raw.githubusercontent.com/CarnaViire/Benchmarks/socket-benchmark/src/BenchmarksApps/Sockets/sockets.benchmarks.yml --scenario http --profile aspnet-perf-win --server.aspnetcoreversion 6.0.0-preview.6.21279.2 --server.runtimeversion 6.0.0-preview.6.21279.3 --server.sdkVersion 6.0.100-preview.6.21276.12

7.0.0-alpha.1.21567.1 (should already include #61258): Mean RPS = 8,776

crank --config https://raw.githubusercontent.com/CarnaViire/Benchmarks/socket-benchmark/src/BenchmarksApps/Sockets/sockets.benchmarks.yml --scenario http --profile aspnet-perf-win --server.aspnetcoreversion 7.0.0-alpha.1.21567.15 --server.runtimeversion 7.0.0-alpha.1.21567.1 --server.sdkVersion 7.0.100-alpha.1.21568.2

Reproes only on Windows, as was confirmed before; but I don't see any "bimodal" behavior on Linux, the performance there is stable and doesn't change much between the versions. I will be investigating further.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 3, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants