Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Perf -22%] System.Collections.IterateForEach<Int32>.ConcurrentBag #39108

Closed
DrewScoggins opened this issue Jul 10, 2020 · 18 comments
Closed

[Perf -22%] System.Collections.IterateForEach<Int32>.ConcurrentBag #39108

DrewScoggins opened this issue Jul 10, 2020 · 18 comments
Assignees
Labels
arch-x64 area-System.Collections os-linux Linux OS (any supported distro) tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Milestone

Comments

@DrewScoggins
Copy link
Member

DrewScoggins commented Jul 10, 2020

Run Information

Architecture x64
OS ubuntu 18.04
Changes diff

Regressions in System.Collections.IterateForEach

Benchmark Baseline Test Test/Base Modality Baseline Outlier
ConcurrentBag 2.04 μs 2.47 μs 1.21 Bimodal False

graph
Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.IterateForEach<Int32>*';

Histogram

System.Collections.IterateForEach.ConcurrentBag(Size: 512)

[1834.841 ; 1917.204) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[1917.204 ; 2008.146) | @
[2008.146 ; 2090.509) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2090.509 ; 2192.475) | @@@
[2192.475 ; 2254.646) | 
[2254.646 ; 2337.009) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2337.009 ; 2412.240) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[2412.240 ; 2497.708) | @@@@@@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@DrewScoggins DrewScoggins added os-linux Linux OS (any supported distro) tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark arch-x64 labels Jul 10, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Collections untriaged New issue has not been triaged by the area owner labels Jul 10, 2020
@ghost
Copy link

ghost commented Jul 10, 2020

Tagging subscribers to this area: @eiriktsarpalis
Notify danmosemsft if you want to be subscribed.

@eiriktsarpalis eiriktsarpalis added this to the 5.0.0 milestone Jul 13, 2020
@danmoseley
Copy link
Member

The regression is apparently at the same time as #39112 and its friends, but as far as I can tell this scenario does not involve Array.Copy. There is one copy of the backing array (there should be only one since it's a single threaded benchmark) to set up the enumerator, but that is done by hand, then enumeration is indexing into it.

It is odd though that copying arrays is involved, and the rgression is at the same time.

@danmoseley
Copy link
Member

I don't have an Ubuntu machine handy at the moment, but confirmed there's no regression on Windows:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.421 (2004/?/20H1)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.100-rc.1.20404.3
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.40203, CoreFX 5.0.20.40203), X64 RyuJIT
Job-RPUQNW : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), X64 RyuJIT
Job-TTAEBI : .NET Core 5.0.0 (CoreCLR 5.0.20.40203, CoreFX 5.0.20.40203), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:DebugType=portable IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1

Method Job Runtime Toolchain Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag Job-RPUQNW .NET Core 3.1 netcoreapp3.1 512 2.277 us 0.0351 us 0.0274 us 2.282 us 2.231 us 2.332 us 1.00 0.00 0.4990 - - 2.05 KB
ConcurrentBag Job-TTAEBI .NET Core 5.0 netcoreapp5.0 512 2.255 us 0.0318 us 0.0297 us 2.250 us 2.188 us 2.297 us 0.99 0.02 0.5024 - - 2.05 KB

@DrewScoggins
Copy link
Member Author

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.023 us 0.0053 us 0.0041 us 2.022 us 2.019 us 2.034 us 0.3317 - - 2.05 KB
Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.138 us 0.0036 us 0.0030 us 2.138 us 2.134 us 2.145 us 0.3339 - - 2.05 KB

Just ran them on a lab machine and this is what I am seeing. It's about a 7% regression, so smaller than we were seeing before, but it does seem to be statistically significant.

@danmoseley
Copy link
Member

That's on Ubuntu 18.04? And Windows looks like no regression for you?

@DrewScoggins
Copy link
Member Author

Yes. No regression on Windows.

@DrewScoggins
Copy link
Member Author

That is the right link. Since it is not in the regression list that means we likely have dropped down in the timing of this test, and looking at this graph, that seems to be the case. This came from this report. You can see we have started to see a little more volatility in that test.

image

Looking at the results from the runtime repo, I see that this test in general seems to have a bimodal state between 2200ns and 2400ns.

image

I am currently working on a feature that will show the entire history of a test, updated daily, that will be linked in these issues so that in cases like this you can see the current state of the test without having to click around through the full report site.

@danmoseley
Copy link
Member

OK thanks. I find it pretty hard to navigate around https://aka.ms/dotnetperfindex . I found the first graph above but not the second. I opened what I think is the monthly report and it's not there
https://pvscmdupload.blob.core.windows.net/reports/08_01_2020/Libraries/report_Monthly_ca=x64_cb=master_co=Ubuntu1804_cr=dotnetcoresdk_cf=Libraries_cc=CompliationMode=tiered-RunKind=micro_2020-08-01.html

@danmoseley
Copy link
Member

I wanted to get a graph with 3.1 on as well so we could see whether it was bimodal then and whether the modes overlapped.

@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Aug 31, 2020
@jeffhandley
Copy link
Member

@layomia @eiriktsarpalis Should we move this to 6.0.0?

@eiriktsarpalis
Copy link
Member

Ran it a few times on my Ubuntu 20.04 VM and I'm not seeing a significant difference in the numbers:

scripts/benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --filter 'System.Collections.IterateForEach<Int32>.ConcurrentBag'

.NET Core 3.1.6

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.997 us 0.0177 us 0.0166 us 2.993 us 2.973 us 3.035 us 0.0239 - - 2.05 KB

.NET 5

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.987 us 0.0139 us 0.0109 us 2.987 us 2.970 us 3.009 us 0.0238 - - 2.05 KB

@jeffhandley
Copy link
Member

Cool -- I will go ahead and close this then. @DrewScoggins, do you see any action items here on the infrastructure side?

@DrewScoggins
Copy link
Member Author

I am still seeing this regression on physical hardware. Would it be possible to try and run these on some physical hardware? I have pasted below the specs of the machine that we are running on, maybe we could try some different hardware to see if this is something directly linked to the processor.

5.0

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.377 us 0.0311 us 0.0291 us 2.376 us 2.349 us 2.448 us 0.3273 - - 2.05 KB

3.1

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 2.056 us 0.0023 us 0.0022 us 2.056 us 2.052 us 2.060 us 0.3289 - - 2.05 KB
BenchmarkDotNet=v0.12.1, OS=ubuntu 18.04
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores

@eiriktsarpalis
Copy link
Member

I am seeing a ~8% regression when run on physical hardware:

 BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 20.04
 Intel Core i5-6260U CPU 1.80GHz (Skylake), 1 CPU, 4 logical and 2 physical cores

.NET Core 3.1

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 3.594 us 0.1785 us 0.1984 us 3.525 us 3.389 us 4.043 us 0.9988 - - 2.05 KB

.NET 5

Method Size Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
ConcurrentBag 512 3.895 us 0.0675 us 0.0598 us 3.897 us 3.800 us 4.003 us 0.9977 - - 2.05 KB

@eiriktsarpalis
Copy link
Member

There is one copy of the backing array (there should be only one since it's a single threaded benchmark) to set up the enumerator, but that is done by hand, then enumeration is indexing into it.

@danmosemsft curious, what do you mean when you say it is done by hand?

@eiriktsarpalis
Copy link
Member

Moving to 6.0.0.

@eiriktsarpalis eiriktsarpalis modified the milestones: 5.0.0, 6.0.0 Sep 30, 2020
@eiriktsarpalis
Copy link
Member

Here are more recent results for the same benchmark. It seems like the regression was mitigated sometime back in November:

image

@ghost ghost locked as resolved and limited conversation to collaborators Sep 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-x64 area-System.Collections os-linux Linux OS (any supported distro) tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Projects
None yet
Development

No branches or pull requests

7 participants