[Perf -26%] System.Tests.Perf_Random (3) #47870

DrewScoggins · 2021-02-04T19:43:36Z

Run Information

Architecture	x64
OS	Windows 10.0.18362
Baseline	18645f9534e16fdf99cf2ff4904901bc4e8e336e
Compare	f80b0575d44ccdc85a18c8dc2dbdcd29f7788d0b
Diff	Link

Regressions in System.Tests.Perf_Random

Benchmark	Baseline	Test	Test/Base	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
Next_int_int	12.15 ns	15.12 ns	1.24				Trace	Trace
Next_int	12.16 ns	15.03 ns	1.24	63.392373156766105	68.50399535495517	1.0806346559316888	Trace	Trace
NextDouble	10.50 ns	13.80 ns	1.31	48.0352881089866	52.731103247262546	1.0977576136864564	Trace	Trace

Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Tests.Perf_Random*'

.

Payloads

Baseline
Compare

Histogram

System.Tests.Perf_Random.Next_int_int

[11.916 ; 12.382) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[12.382 ; 12.843) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[12.843 ; 13.335) | @@@@@@@@@
[13.335 ; 13.761) | @@
[13.761 ; 14.248) | 
[14.248 ; 14.734) | 
[14.734 ; 15.193) | @@@@@@@@@@@@@@
[15.193 ; 15.774) | @@@@@@@@@@@

Compare Jit Disasm

; System.Tests.Perf_Random.Next_int_int()
       mov       rcx,[rcx+8]
       mov       edx,64
       mov       r8d,2710
       mov       rax,[rcx]
       mov       rax,[rax+40]
       mov       rax,[rax+30]
       jmp       rax
; Total bytes of code 29

System.Tests.Perf_Random.Next_int

[12.112 ; 12.359) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[12.359 ; 12.648) | @@@
[12.648 ; 13.158) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[13.158 ; 13.594) | 
[13.594 ; 14.134) | @
[14.134 ; 14.667) | @
[14.667 ; 15.193) | @@@@@@@@@@@@@@@@@@
[15.193 ; 16.013) | @@@@@@@

Baseline Jit Disasm

; System.Tests.Perf_Random.Next_int()
       mov       rcx,[rcx+8]
       mov       edx,2710
       mov       rax,[rcx]
       mov       rax,[rax+40]
       mov       rax,[rax+30]
       jmp       rax
; Total bytes of code 23

Compare Jit Disasm

; System.Tests.Perf_Random.Next_int()
       mov       rcx,[rcx+8]
       mov       edx,2710
       mov       rax,[rcx]
       mov       rax,[rax+40]
       mov       rax,[rax+28]
       jmp       rax
; Total bytes of code 23

System.Tests.Perf_Random.NextDouble

[10.135 ; 10.562) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[10.562 ; 10.903) | @@@@@@@
[10.903 ; 11.160) | @@@@@
[11.160 ; 11.417) | 
[11.417 ; 11.674) | 
[11.674 ; 11.931) | 
[11.931 ; 12.188) | 
[12.188 ; 12.445) | 
[12.445 ; 12.702) | 
[12.702 ; 13.067) | 
[13.067 ; 13.570) | @@@@@@@@@@@@@@@@@@@@
[13.570 ; 13.881) | @@
[13.881 ; 14.619) | @@

Baseline Jit Disasm

; System.Tests.Perf_Random.NextDouble()
       mov       rcx,[rcx+8]
       mov       rax,[rcx]
       mov       rax,[rax+48]
       mov       rax,[rax]
       jmp       rax
; Total bytes of code 17

Compare Jit Disasm

; System.Tests.Perf_Random.NextDouble()
       mov       rcx,[rcx+8]
       mov       rax,[rcx]
       mov       rax,[rax+48]
       mov       rax,[rax+18]
       jmp       rax
; Total bytes of code 18

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

The text was updated successfully, but these errors were encountered:

stephentoub · 2021-02-04T22:12:09Z

@DrewScoggins, at the same time these were shown to regress slightly, did other Random benchmarks improve? How can I see all of the benchmarks on this benchmark class for the same time period?

DrewScoggins · 2021-02-04T22:32:52Z

We did see improvements in the constructor as evidenced by the auto-filed issues below. You can also look at these index pages that have links to every tests full history. I will just use find in the browser to look for the tests that I care about.

Windows x64
Windows x86
Ubuntu x64
Ubuntu ARM64

DrewScoggins/performance-2#3846
DrewScoggins/performance-2#3851
DrewScoggins/performance-2#3863

stephentoub · 2021-02-04T22:36:50Z

Hmm. I assume this is due to my changes in 5ee25aa, but I saw these as improvements rather than regressions locally, and I would have expected massive improvements in benchmarks on NextBytes() and Next(), so something seems a bit off here.

DrewScoggins · 2021-02-05T00:05:40Z

Here are the results from running on my machine. They seem to match what we are seeing in the lab. Next_Double and Ctor got faster, but the other three got worse. I have also grabbed the links to the coreruns that were used and put them as links above the tables. I got this information from the details area of the issue.

Baseline

Method	Mean	Error	StdDev	Median	Min	Max	Gen 0	Gen 1	Gen 2	Allocated
ctor_seeded	655.18 ns	20.283 ns	19.921 ns	657.28 ns	619.09 ns	699.49 ns	0.0640	-	-	280 B
ctor	2,153.59 ns	92.377 ns	102.677 ns	2,135.08 ns	1,986.38 ns	2,344.33 ns	0.0604	-	-	280 B
Next_int	15.29 ns	0.940 ns	1.045 ns	15.18 ns	13.53 ns	17.39 ns	-	-	-	-
Next_int_unseeded	13.90 ns	1.111 ns	1.235 ns	14.33 ns	11.38 ns	15.74 ns	-	-	-	-
Next_int_int	13.96 ns	0.840 ns	0.967 ns	13.71 ns	12.51 ns	15.83 ns	-	-	-	-
Next_int_int_unseeded	14.48 ns	0.743 ns	0.856 ns	14.48 ns	13.10 ns	16.00 ns	-	-	-	-
NextBytes	9,276.95 ns	380.486 ns	407.116 ns	9,297.49 ns	8,420.44 ns	10,071.75 ns	-	-	-	-
NextBytes_unseeded	10,747.47 ns	405.822 ns	451.070 ns	10,629.50 ns	10,074.97 ns	11,794.80 ns	-	-	-	-
NextDouble	11.12 ns	0.729 ns	0.839 ns	10.97 ns	10.15 ns	13.33 ns	-	-	-	-
NextDouble_unseeded	11.52 ns	0.903 ns	1.039 ns	11.29 ns	10.40 ns	13.93 ns	-	-	-	-
NextBytes_span	9,532.73 ns	189.283 ns	202.531 ns	9,499.73 ns	9,244.49 ns	9,970.05 ns	-	-	-	-
NextBytes_span_unseeded	9,813.42 ns	500.972 ns	576.920 ns	9,765.53 ns	9,079.85 ns	10,872.94 ns	-	-	-	-

Compare

Method	Mean	Error	StdDev	Median	Min	Max	Gen 0	Gen 1	Gen 2	Allocated
ctor_seeded	507.186 ns	17.5872 ns	20.2534 ns	508.721 ns	472.523 ns	546.929 ns	0.0733	-	-	312 B
ctor	166.817 ns	7.7035 ns	8.8714 ns	166.989 ns	152.572 ns	179.911 ns	0.0167	-	-	72 B
Next_int	16.054 ns	0.5876 ns	0.6767 ns	16.029 ns	14.969 ns	17.407 ns	-	-	-	-
Next_int_unseeded	12.374 ns	0.6484 ns	0.7207 ns	12.092 ns	11.555 ns	13.890 ns	-	-	-	-
Next_int_int	15.801 ns	0.5816 ns	0.6465 ns	15.537 ns	15.028 ns	16.981 ns	-	-	-	-
Next_int_int_unseeded	12.435 ns	0.4619 ns	0.5134 ns	12.340 ns	11.724 ns	13.485 ns	-	-	-	-
NextBytes	8,650.211 ns	368.9268 ns	424.8566 ns	8,545.533 ns	8,156.941 ns	9,401.946 ns	-	-	-	-
NextBytes_unseeded	352.777 ns	14.1414 ns	15.7182 ns	346.657 ns	332.019 ns	383.492 ns	-	-	-	-
NextDouble	13.823 ns	0.7075 ns	0.7864 ns	13.553 ns	13.027 ns	15.922 ns	-	-	-	-
NextDouble_unseeded	3.744 ns	0.1713 ns	0.1904 ns	3.749 ns	3.380 ns	4.075 ns	-	-	-	-
NextBytes_span	10,647.569 ns	574.6784 ns	661.8006 ns	10,569.405 ns	9,756.100 ns	11,665.939 ns	-	-	-	-
NextBytes_span_unseeded	386.359 ns	15.9206 ns	18.3342 ns	382.575 ns	355.052 ns	419.783 ns	-	-	-	-

danmoseley · 2021-02-05T01:30:52Z

For the avoidance of doubt, "_unseeded" and "ctor" are the ones using the new algorithm. Below I've rearranged @DrewScoggins numbers into two tables, the seeded ones in first table (same algorithm) unseeded in second table (new algorithm) each in pairs of rows, old first then new.

The seeded ones vary, but there's a possible regression in some, perhaps due to the new indirection.

Method	Mean	Error	StdDev	Median	Min	Max	Gen 0	Gen 1	Gen 2	Allocated
ctor_seeded	655.18 ns	20.283 ns	19.921 ns	657.28 ns	619.09 ns	699.49 ns	0.0640	-	-	280 B
ctor_seeded	507.186 ns	17.5872 ns	20.2534 ns	508.721 ns	472.523 ns	546.929 ns	0.0733	-	-	312 B
Next_int	15.29 ns	0.940 ns	1.045 ns	15.18 ns	13.53 ns	17.39 ns	-	-	-	-
Next_int	16.054 ns	0.5876 ns	0.6767 ns	16.029 ns	14.969 ns	17.407 ns	-	-	-	-
Next_int_int	13.96 ns	0.840 ns	0.967 ns	13.71 ns	12.51 ns	15.83 ns	-	-	-	-
Next_int_int	15.801 ns	0.5816 ns	0.6465 ns	15.537 ns	15.028 ns	16.981 ns	-	-	-	-
NextBytes	9,276.95 ns	380.486 ns	407.116 ns	9,297.49 ns	8,420.44 ns	10,071.75 ns	-	-	-	-
NextBytes	8,650.211 ns	368.9268 ns	424.8566 ns	8,545.533 ns	8,156.941 ns	9,401.946 ns	-	-	-	-
NextDouble	11.12 ns	0.729 ns	0.839 ns	10.97 ns	10.15 ns	13.33 ns	-	-	-	-
NextDouble	13.823 ns	0.7075 ns	0.7864 ns	13.553 ns	13.027 ns	15.922 ns	-	-	-	-
NextBytes_span	9,532.73 ns	189.283 ns	202.531 ns	9,499.73 ns	9,244.49 ns	9,970.05 ns	-	-	-	-
NextBytes_span	10,647.569 ns	574.6784 ns	661.8006 ns	10,569.405 ns	9,756.100 ns	11,665.939 ns	-	-	-	-

The unseeded ones I think have the improvements we expected

Method	Mean	Error	StdDev	Median	Min	Max	Gen 0	Gen 1	Gen 2	Allocated
ctor	2,153.59 ns	92.377 ns	102.677 ns	2,135.08 ns	1,986.38 ns	2,344.33 ns	0.0604	-	-	280 B
ctor	166.817 ns	7.7035 ns	8.8714 ns	166.989 ns	152.572 ns	179.911 ns	0.0167	-	-	72 B
Next_int_unseeded	13.90 ns	1.111 ns	1.235 ns	14.33 ns	11.38 ns	15.74 ns	-	-	-	-
Next_int_unseeded	12.374 ns	0.6484 ns	0.7207 ns	12.092 ns	11.555 ns	13.890 ns	-	-	-	-
Next_int_int_unseeded	14.48 ns	0.743 ns	0.856 ns	14.48 ns	13.10 ns	16.00 ns	-	-	-	-
Next_int_int_unseeded	12.435 ns	0.4619 ns	0.5134 ns	12.340 ns	11.724 ns	13.485 ns	-	-	-	-
NextBytes_unseeded	10,747.47 ns	405.822 ns	451.070 ns	10,629.50 ns	10,074.97 ns	11,794.80 ns	-	-	-	-
NextBytes_unseeded	352.777 ns	14.1414 ns	15.7182 ns	346.657 ns	332.019 ns	383.492 ns	-	-	-	-
NextDouble_unseeded	11.52 ns	0.903 ns	1.039 ns	11.29 ns	10.40 ns	13.93 ns	-	-	-	-
NextDouble_unseeded	3.744 ns	0.1713 ns	0.1904 ns	3.749 ns	3.380 ns	4.075 ns	-	-	-	-
NextBytes_span_unseeded	9,813.42 ns	500.972 ns	576.920 ns	9,765.53 ns	9,079.85 ns	10,872.94 ns	-	-	-	-
NextBytes_span_unseeded	386.359 ns	15.9206 ns	18.3342 ns	382.575 ns	355.052 ns	419.783 ns	-	-	-	-

danmoseley · 2021-02-05T01:33:15Z

Note we have no perf test for Next() .. adding one dotnet/performance#1649

danmoseley · 2021-02-05T01:33:46Z

What did it look like on 32 bit, Drew? That implementation is different.

stephentoub · 2021-02-05T01:38:19Z

For the avoidance of doubt, "_unseeded" and "ctor" are the ones using the new algorithm

Ohhh! That makes much more sense then.

I'm fine accepting the small regressions in the seeded ones; we knew about those, and I don't think they matter. It's good to have the tests to catch major unexpected regressions, but if someone is using the seeded ctor they're either a) just trying to provide better randomness via their own seed and should stop doing so, or b) trying to get reproducibility in which case it's most likely in a test or something where a small regression isn't a big deal.

danmoseley · 2021-02-05T01:53:44Z

OK fine.

DrewScoggins added os-windows tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark arch-x64 labels Feb 4, 2021

dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Feb 4, 2021

stephentoub added area-System.Runtime and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Feb 4, 2021

danmoseley closed this as completed Feb 5, 2021

stephentoub mentioned this issue Feb 9, 2021

[MONO][PERF] 28-32% Pefromance Regression in System.Tests.Perf_Random #48087

Closed

ghost locked as resolved and limited conversation to collaborators Mar 7, 2021

tannergooding removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Perf -26%] System.Tests.Perf_Random (3) #47870

[Perf -26%] System.Tests.Perf_Random (3) #47870

DrewScoggins commented Feb 4, 2021

Payloads

Histogram

System.Tests.Perf_Random.Next_int_int

Compare Jit Disasm

System.Tests.Perf_Random.Next_int

Baseline Jit Disasm

Compare Jit Disasm

System.Tests.Perf_Random.NextDouble

Baseline Jit Disasm

Compare Jit Disasm

Docs

stephentoub commented Feb 4, 2021

DrewScoggins commented Feb 4, 2021

stephentoub commented Feb 4, 2021

DrewScoggins commented Feb 5, 2021 •

edited

Loading

danmoseley commented Feb 5, 2021

danmoseley commented Feb 5, 2021 •

edited

Loading

danmoseley commented Feb 5, 2021

stephentoub commented Feb 5, 2021

danmoseley commented Feb 5, 2021

[Perf -26%] System.Tests.Perf_Random (3) #47870

[Perf -26%] System.Tests.Perf_Random (3) #47870

Comments

DrewScoggins commented Feb 4, 2021

Run Information

Regressions in System.Tests.Perf_Random

Repro

Payloads

Histogram

System.Tests.Perf_Random.Next_int_int

Compare Jit Disasm

System.Tests.Perf_Random.Next_int

Baseline Jit Disasm

Compare Jit Disasm

System.Tests.Perf_Random.NextDouble

Baseline Jit Disasm

Compare Jit Disasm

Docs

stephentoub commented Feb 4, 2021

DrewScoggins commented Feb 4, 2021

stephentoub commented Feb 4, 2021

DrewScoggins commented Feb 5, 2021 • edited Loading

danmoseley commented Feb 5, 2021

danmoseley commented Feb 5, 2021 • edited Loading

danmoseley commented Feb 5, 2021

stephentoub commented Feb 5, 2021

danmoseley commented Feb 5, 2021

DrewScoggins commented Feb 5, 2021 •

edited

Loading

danmoseley commented Feb 5, 2021 •

edited

Loading