Regressions in System.Numerics.Tests #81766
Run Information
Regressions in System.Numerics.Tests.Perf_Vector3
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector3*'
Payloads
Histogram
System.Numerics.Tests.Perf_Vector3.TransformByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Vector3.TransformByQuaternionBenchmark()
sub rsp,98
vzeroupper
vmovaps [rsp+80],xmm6
vmovaps [rsp+70],xmm7
vmovaps [rsp+60],xmm8
vmovaps [rsp+50],xmm9
vmovaps [rsp+40],xmm10
vmovaps [rsp+30],xmm11
vmovaps [rsp+20],xmm12
vmovaps [rsp+10],xmm13
vmovaps [rsp],xmm14
vmovups xmm0,[7FF9AB2D5400]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm7,xmm2,xmm0
vmulss xmm8,xmm4,xmm0
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm9,xmm4,xmm1
vmulss xmm1,xmm6,xmm1
vmulss xmm4,xmm4,xmm3
vmulss xmm3,xmm6,xmm3
vmulss xmm5,xmm6,xmm5
vmovss xmm6,dword ptr [7FF9AB2D5410]
vsubss xmm10,xmm6,xmm4
vsubss xmm10,xmm10,xmm5
vmovups xmm11,[7FF9AB2D5420]
vmovaps xmm12,xmm11
vmulss xmm10,xmm10,xmm12
vsubss xmm13,xmm9,xmm0
vmovshdup xmm14,xmm11
vmulss xmm13,xmm13,xmm14
vaddss xmm10,xmm10,xmm13
vaddss xmm13,xmm1,xmm8
vunpckhps xmm11,xmm11,xmm11
vmulss xmm13,xmm13,xmm11
vaddss xmm10,xmm10,xmm13
vaddss xmm0,xmm9,xmm0
vmulss xmm0,xmm0,xmm12
vsubss xmm2,xmm6,xmm2
vsubss xmm5,xmm2,xmm5
vmulss xmm5,xmm5,xmm14
vaddss xmm0,xmm0,xmm5
vsubss xmm5,xmm3,xmm7
vmulss xmm5,xmm5,xmm11
vaddss xmm0,xmm0,xmm5
vinsertps xmm0,xmm10,xmm0,10
vsubss xmm1,xmm1,xmm8
vmulss xmm1,xmm1,xmm12
vaddss xmm3,xmm3,xmm7
vmulss xmm3,xmm3,xmm14
vaddss xmm1,xmm1,xmm3
vsubss xmm2,xmm2,xmm4
vmulss xmm2,xmm2,xmm11
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,28
vmovsd qword ptr [rdx],xmm0
vextractps dword ptr [rdx+8],xmm0,2
mov rax,rdx
vmovaps xmm6,[rsp+80]
vmovaps xmm7,[rsp+70]
vmovaps xmm8,[rsp+60]
vmovaps xmm9,[rsp+50]
vmovaps xmm10,[rsp+40]
vmovaps xmm11,[rsp+30]
vmovaps xmm12,[rsp+20]
vmovaps xmm13,[rsp+10]
vmovaps xmm14,[rsp]
add rsp,98
ret
; Total bytes of code 377
Docs
Profiling workflow for dotnet/runtime repository
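Reading the Perf_Vector3 Compare disassembly above, the scalar sequence (three `vaddss` doublings, nine `vmulss` products, then three multiply-add chains) appears to follow the usual quaternion rotation formula. The sketch below restates that arithmetic in Python as an aid to reading the listing. The function name `transform_by_quaternion` and the tuple layout `(x, y, z, w)` are assumptions made for illustration; this is not the runtime's source.

```python
import math

def transform_by_quaternion(v, q):
    """Rotate the 3-vector v = (x, y, z) by the quaternion q = (x, y, z, w)."""
    vx, vy, vz = v
    qx, qy, qz, qw = q

    # Doubled components (the three leading vaddss instructions).
    x2, y2, z2 = qx + qx, qy + qy, qz + qz

    # Nine pairwise products (the run of vmulss instructions that follows).
    wx2, wy2, wz2 = qw * x2, qw * y2, qw * z2
    xx2, xy2, xz2 = qx * x2, qx * y2, qx * z2
    yy2, yz2, zz2 = qy * y2, qy * z2, qz * z2

    # Each output lane is a three-term multiply-add chain.
    return (
        vx * (1.0 - yy2 - zz2) + vy * (xy2 - wz2) + vz * (xz2 + wy2),
        vx * (xy2 + wz2) + vy * (1.0 - xx2 - zz2) + vz * (yz2 - wx2),
        vx * (xz2 - wy2) + vy * (yz2 + wx2) + vz * (1.0 - xx2 - yy2),
    )

# Example: a 90-degree rotation about the z axis should map the x axis
# onto the y axis, up to floating-point rounding.
s = math.sqrt(0.5)
rotated = transform_by_quaternion((1.0, 0.0, 0.0), (0.0, 0.0, s, s))
```

In the listing each of the three output lanes is computed in scalar registers and then merged with `vinsertps`, which is why so many callee-saved `xmm` registers are spilled in the prologue.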
Regressions in System.Numerics.Tests.Perf_Plane
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Plane*'
Payloads
Histogram
System.Numerics.Tests.Perf_Plane.TransformByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Plane.TransformByQuaternionBenchmark()
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovups xmm0,[7FF9B8F753D0]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm7,xmm2,xmm0
vmulss xmm8,xmm4,xmm0
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm9,xmm4,xmm1
vmulss xmm1,xmm6,xmm1
vmulss xmm4,xmm4,xmm3
vmulss xmm3,xmm6,xmm3
vmulss xmm5,xmm6,xmm5
vmovss xmm6,dword ptr [7FF9B8F753E0]
vsubss xmm10,xmm6,xmm4
vsubss xmm10,xmm10,xmm5
vsubss xmm11,xmm9,xmm0
vaddss xmm12,xmm1,xmm8
vaddss xmm0,xmm9,xmm0
vsubss xmm2,xmm6,xmm2
vsubss xmm5,xmm2,xmm5
vsubss xmm6,xmm3,xmm7
vsubss xmm1,xmm1,xmm8
vaddss xmm3,xmm3,xmm7
vsubss xmm2,xmm2,xmm4
vmovups xmm4,[7FF9B8F753F0]
vmovaps xmm7,xmm4
vmovshdup xmm8,xmm4
vunpckhps xmm9,xmm4,xmm4
vmulss xmm10,xmm7,xmm10
vmulss xmm11,xmm8,xmm11
vaddss xmm10,xmm10,xmm11
vmulss xmm11,xmm9,xmm12
vaddss xmm10,xmm10,xmm11
vmulss xmm0,xmm7,xmm0
vmulss xmm5,xmm8,xmm5
vaddss xmm0,xmm0,xmm5
vmulss xmm5,xmm9,xmm6
vaddss xmm0,xmm0,xmm5
vinsertps xmm0,xmm10,xmm0,10
vmulss xmm1,xmm7,xmm1
vmulss xmm3,xmm8,xmm3
vaddss xmm1,xmm1,xmm3
vmulss xmm2,xmm9,xmm2
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vshufps xmm1,xmm4,xmm4,0FF
vinsertps xmm0,xmm0,xmm1,30
vmovups [rdx],xmm0
mov rax,rdx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 336
Docs
Profiling workflow for dotnet/runtime repository
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
Run Information
Regressions in System.Numerics.Tests.Perf_Quaternion
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Quaternion*'
Payloads
Histogram
System.Numerics.Tests.Perf_Quaternion.LengthSquaredBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Quaternion.LengthSquaredBenchmark()
vzeroupper
vmovups xmm0,[7FFB80444EB0]
vdpps xmm0,xmm0,[7FFB80444EB0],0FF
ret
; Total bytes of code 22
System.Numerics.Tests.Perf_Quaternion.LengthBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Quaternion.LengthBenchmark()
vzeroupper
vmovups xmm0,[7FF8FF1E4EB0]
vdpps xmm0,xmm0,[7FF8FF1E4EB0],0FF
vsqrtss xmm0,xmm0,xmm0
ret
; Total bytes of code 26
System.Numerics.Tests.Perf_Quaternion.SlerpBenchmark
Description of detection logic
; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovss xmm0,dword ptr [r8]
vmovss xmm1,dword ptr [r8+4]
vmovss xmm2,dword ptr [r8+8]
vmovss xmm4,dword ptr [r8+0C]
vmovss xmm5,dword ptr [rdx]
vmovss xmm6,dword ptr [rdx+4]
vmovss xmm7,dword ptr [rdx+8]
vmovss xmm8,dword ptr [rdx+0C]
vmovss xmm9,dword ptr [7FFC1D7251F0]
vsubss xmm10,xmm9,xmm3
vmulss xmm11,xmm5,xmm0
vmulss xmm12,xmm6,xmm1
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm7,xmm2
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm8,xmm4
vaddss xmm11,xmm11,xmm12
vxorps xmm12,xmm12,xmm12
vucomiss xmm11,xmm12
jb short M01_L00
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vaddss xmm0,xmm5,xmm0
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vaddss xmm1,xmm5,xmm1
vmulss xmm5,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vaddss xmm2,xmm5,xmm2
vmulss xmm5,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vaddss xmm3,xmm5,xmm3
jmp short M01_L01
M01_L00:
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vsubss xmm0,xmm5,xmm0
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vsubss xmm1,xmm5,xmm1
vmulss xmm5,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vsubss xmm2,xmm5,xmm2
vmulss xmm5,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vsubss xmm3,xmm5,xmm3
M01_L01:
vmulss xmm4,xmm0,xmm0
vmulss xmm5,xmm1,xmm1
vaddss xmm4,xmm4,xmm5
vmulss xmm5,xmm2,xmm2
vaddss xmm4,xmm4,xmm5
vmulss xmm5,xmm3,xmm3
vaddss xmm4,xmm4,xmm5
vsqrtss xmm4,xmm4,xmm4
vdivss xmm4,xmm9,xmm4
vmulss xmm0,xmm0,xmm4
vmulss xmm1,xmm1,xmm4
vmulss xmm2,xmm2,xmm4
vmulss xmm3,xmm3,xmm4
vmovss dword ptr [rcx],xmm0
vmovss dword ptr [rcx+4],xmm1
vmovss dword ptr [rcx+8],xmm2
vmovss dword ptr [rcx+0C],xmm3
mov rax,rcx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 365
Compare Jit Disasm
; System.Numerics.Tests.Perf_Quaternion.SlerpBenchmark()
push rsi
sub rsp,40
vzeroupper
mov rsi,rdx
vxorps xmm3,xmm3,xmm3
vmovaps [rsp+30],xmm3
vmovups xmm3,[7FF8FF1E5030]
vmovaps [rsp+20],xmm3
mov rcx,rsi
lea r8,[rsp+20]
lea rdx,[rsp+30]
vmovss xmm3,dword ptr [7FF8FF1E5040]
call qword ptr [7FF8FF5A1CA8]; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
mov rax,rsi
add rsp,40
pop rsi
ret
; Total bytes of code 71
; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovss xmm0,dword ptr [r8]
vmovss xmm1,dword ptr [r8+4]
vmovss xmm2,dword ptr [r8+8]
vmovss xmm4,dword ptr [r8+0C]
vmovss xmm5,dword ptr [rdx]
vmovss xmm6,dword ptr [rdx+4]
vmovss xmm7,dword ptr [rdx+8]
vmovss xmm8,dword ptr [rdx+0C]
vmovss xmm9,dword ptr [7FF8FF1E5228]
vsubss xmm10,xmm9,xmm3
vmulss xmm11,xmm5,xmm0
vmulss xmm12,xmm6,xmm1
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm7,xmm2
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm8,xmm4
vaddss xmm11,xmm11,xmm12
vxorps xmm12,xmm12,xmm12
vucomiss xmm11,xmm12
jb short M01_L00
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vaddss xmm0,xmm5,xmm0
vinsertps xmm0,xmm0,xmm0,0E
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vaddss xmm1,xmm5,xmm1
vinsertps xmm0,xmm0,xmm1,10
vmulss xmm1,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vmulss xmm1,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vaddss xmm4,xmm1,xmm3
vinsertps xmm0,xmm0,xmm4,30
jmp short M01_L01
M01_L00:
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vsubss xmm0,xmm5,xmm0
vinsertps xmm0,xmm0,xmm0,0E
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vsubss xmm1,xmm5,xmm1
vinsertps xmm0,xmm0,xmm1,10
vmulss xmm1,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vsubss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vmulss xmm1,xmm10,xmm8
vmulss xmm2,xmm3,xmm4
vsubss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,30
M01_L01:
vmovaps xmm1,xmm0
vmulss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vunpckhps xmm3,xmm0,xmm0
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vshufps xmm3,xmm0,xmm0,0FF
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vsqrtss xmm2,xmm2,xmm2
vdivss xmm2,xmm9,xmm2
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,0
vmovshdup xmm1,xmm0
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,10
vunpckhps xmm1,xmm0,xmm0
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vshufps xmm1,xmm0,xmm0,0FF
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,30
vmovups [rcx],xmm0
mov rax,rcx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 452
System.Numerics.Tests.Perf_Quaternion.LerpBenchmark
Description of detection logic
; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovss xmm0,dword ptr [r8]
vmovss xmm1,dword ptr [r8+4]
vmovss xmm2,dword ptr [r8+8]
vmovss xmm4,dword ptr [r8+0C]
vmovss xmm5,dword ptr [rdx]
vmovss xmm6,dword ptr [rdx+4]
vmovss xmm7,dword ptr [rdx+8]
vmovss xmm8,dword ptr [rdx+0C]
vmovss xmm9,dword ptr [7FFBA68B51F0]
vsubss xmm10,xmm9,xmm3
vmulss xmm11,xmm5,xmm0
vmulss xmm12,xmm6,xmm1
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm7,xmm2
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm8,xmm4
vaddss xmm11,xmm11,xmm12
vxorps xmm12,xmm12,xmm12
vucomiss xmm11,xmm12
jb short M01_L00
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vaddss xmm0,xmm5,xmm0
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vaddss xmm1,xmm5,xmm1
vmulss xmm5,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vaddss xmm2,xmm5,xmm2
vmulss xmm5,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vaddss xmm3,xmm5,xmm3
jmp short M01_L01
M01_L00:
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vsubss xmm0,xmm5,xmm0
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vsubss xmm1,xmm5,xmm1
vmulss xmm5,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vsubss xmm2,xmm5,xmm2
vmulss xmm5,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vsubss xmm3,xmm5,xmm3
M01_L01:
vmulss xmm4,xmm0,xmm0
vmulss xmm5,xmm1,xmm1
vaddss xmm4,xmm4,xmm5
vmulss xmm5,xmm2,xmm2
vaddss xmm4,xmm4,xmm5
vmulss xmm5,xmm3,xmm3
vaddss xmm4,xmm4,xmm5
vsqrtss xmm4,xmm4,xmm4
vdivss xmm4,xmm9,xmm4
vmulss xmm0,xmm0,xmm4
vmulss xmm1,xmm1,xmm4
vmulss xmm2,xmm2,xmm4
vmulss xmm3,xmm3,xmm4
vmovss dword ptr [rcx],xmm0
vmovss dword ptr [rcx+4],xmm1
vmovss dword ptr [rcx+8],xmm2
vmovss dword ptr [rcx+0C],xmm3
mov rax,rcx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 365
Compare Jit Disasm
; System.Numerics.Tests.Perf_Quaternion.LerpBenchmark()
push rsi
sub rsp,40
vzeroupper
mov rsi,rdx
vxorps xmm3,xmm3,xmm3
vmovaps [rsp+30],xmm3
vmovups xmm3,[7FFDA9A45030]
vmovaps [rsp+20],xmm3
mov rcx,rsi
lea r8,[rsp+20]
lea rdx,[rsp+30]
vmovss xmm3,dword ptr [7FFDA9A45040]
call qword ptr [7FFDA9E01CA8]; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
mov rax,rsi
add rsp,40
pop rsi
ret
; Total bytes of code 71
; System.Numerics.Quaternion.Lerp(System.Numerics.Quaternion, System.Numerics.Quaternion, Single)
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovss xmm0,dword ptr [r8]
vmovss xmm1,dword ptr [r8+4]
vmovss xmm2,dword ptr [r8+8]
vmovss xmm4,dword ptr [r8+0C]
vmovss xmm5,dword ptr [rdx]
vmovss xmm6,dword ptr [rdx+4]
vmovss xmm7,dword ptr [rdx+8]
vmovss xmm8,dword ptr [rdx+0C]
vmovss xmm9,dword ptr [7FFDA9A45228]
vsubss xmm10,xmm9,xmm3
vmulss xmm11,xmm5,xmm0
vmulss xmm12,xmm6,xmm1
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm7,xmm2
vaddss xmm11,xmm11,xmm12
vmulss xmm12,xmm8,xmm4
vaddss xmm11,xmm11,xmm12
vxorps xmm12,xmm12,xmm12
vucomiss xmm11,xmm12
jb short M01_L00
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vaddss xmm0,xmm5,xmm0
vinsertps xmm0,xmm0,xmm0,0E
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vaddss xmm1,xmm5,xmm1
vinsertps xmm0,xmm0,xmm1,10
vmulss xmm1,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vmulss xmm1,xmm10,xmm8
vmulss xmm3,xmm3,xmm4
vaddss xmm4,xmm1,xmm3
vinsertps xmm0,xmm0,xmm4,30
jmp short M01_L01
M01_L00:
vmulss xmm5,xmm10,xmm5
vmulss xmm0,xmm3,xmm0
vsubss xmm0,xmm5,xmm0
vinsertps xmm0,xmm0,xmm0,0E
vmulss xmm5,xmm10,xmm6
vmulss xmm1,xmm3,xmm1
vsubss xmm1,xmm5,xmm1
vinsertps xmm0,xmm0,xmm1,10
vmulss xmm1,xmm10,xmm7
vmulss xmm2,xmm3,xmm2
vsubss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vmulss xmm1,xmm10,xmm8
vmulss xmm2,xmm3,xmm4
vsubss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,30
M01_L01:
vmovaps xmm1,xmm0
vmulss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vunpckhps xmm3,xmm0,xmm0
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vshufps xmm3,xmm0,xmm0,0FF
vmulss xmm3,xmm3,xmm3
vaddss xmm2,xmm2,xmm3
vsqrtss xmm2,xmm2,xmm2
vdivss xmm2,xmm9,xmm2
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,0
vmovshdup xmm1,xmm0
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,10
vunpckhps xmm1,xmm0,xmm0
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vshufps xmm1,xmm0,xmm0,0FF
vmulss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,30
vmovups [rcx],xmm0
mov rax,rcx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 452
Docs
Profiling workflow for dotnet/runtime repository
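Both `Quaternion.Lerp` listings above (365 and 452 bytes) appear to perform the same arithmetic: compute `1 - amount`, take the dot product of the two inputs, add or subtract the weighted second quaternion depending on the sign of that dot product, then scale by the reciprocal of the length. They differ mainly in how the result is assembled: the shorter listing keeps four scalars and stores them with four `vmovss` instructions, while the longer one merges every lane into `xmm0` with `vinsertps` and re-extracts lanes with `vmovshdup`/`vunpckhps`/`vshufps` before a single `vmovups` store. A minimal Python sketch of the arithmetic, assuming an `(x, y, z, w)` tuple layout and a hypothetical function name `quaternion_lerp`:

```python
import math

def quaternion_lerp(q1, q2, amount):
    """Normalized linear interpolation, as read from the listings above."""
    t = amount
    t1 = 1.0 - t                                # vsubss xmm10, xmm9, xmm3

    # Four-term dot product (vmulss / vaddss chain).
    dot = sum(a * b for a, b in zip(q1, q2))

    # `jb` after vucomiss takes the subtracting path when dot < 0
    # (or when the comparison is unordered, i.e. NaN).
    sign = 1.0 if dot >= 0.0 else -1.0

    blended = [t1 * a + sign * t * b for a, b in zip(q1, q2)]

    # vsqrtss followed by vdivss against the constant loaded into xmm9.
    inv_norm = 1.0 / math.sqrt(sum(c * c for c in blended))
    return tuple(c * inv_norm for c in blended)

# Example: interpolating halfway between the identity and its negation
# takes the subtracting path and yields a unit quaternion.
halfway = quaternion_lerp((0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, -1.0), 0.5)
```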
Regressions in System.Numerics.Tests.Perf_Vector2
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector2*'
Payloads
Histogram
System.Numerics.Tests.Perf_Vector2.TransformByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Vector2.TransformByQuaternionBenchmark()
sub rsp,38
vzeroupper
vmovaps [rsp+20],xmm6
vmovaps [rsp+10],xmm7
vmovaps [rsp],xmm8
vmovups xmm0,[7FF99F7F5280]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm1,xmm4,xmm1
vmulss xmm3,xmm4,xmm3
vmulss xmm4,xmm6,xmm5
vmovss xmm5,dword ptr [7FF99F7F5290]
vsubss xmm3,xmm5,xmm3
vsubss xmm3,xmm3,xmm4
vmovsd xmm6,qword ptr [7FF99F7F5298]
vmovaps xmm7,xmm6
vmulss xmm3,xmm3,xmm7
vsubss xmm8,xmm1,xmm0
vmovshdup xmm6,xmm6
vmulss xmm8,xmm8,xmm6
vaddss xmm3,xmm3,xmm8
vaddss xmm0,xmm1,xmm0
vmulss xmm0,xmm0,xmm7
vsubss xmm1,xmm5,xmm2
vsubss xmm1,xmm1,xmm4
vmulss xmm1,xmm1,xmm6
vaddss xmm0,xmm0,xmm1
vinsertps xmm0,xmm3,xmm0,1C
vmovq rax,xmm0
vmovaps xmm6,[rsp+20]
vmovaps xmm7,[rsp+10]
vmovaps xmm8,[rsp]
add rsp,38
ret
; Total bytes of code 187
Docs
Profiling workflow for dotnet/runtime repository
Run Information
Regressions in System.Numerics.Tests.Perf_Vector4
Repro
git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_Vector4*'
Payloads
Histogram
System.Numerics.Tests.Perf_Vector4.TransformVector3ByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Vector4.TransformVector3ByQuaternionBenchmark()
sub rsp,98
vzeroupper
vmovaps [rsp+80],xmm6
vmovaps [rsp+70],xmm7
vmovaps [rsp+60],xmm8
vmovaps [rsp+50],xmm9
vmovaps [rsp+40],xmm10
vmovaps [rsp+30],xmm11
vmovaps [rsp+20],xmm12
vmovaps [rsp+10],xmm13
vmovaps [rsp],xmm14
vmovups xmm0,[7FF7D5035400]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm7,xmm2,xmm0
vmulss xmm8,xmm4,xmm0
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm9,xmm4,xmm1
vmulss xmm1,xmm6,xmm1
vmulss xmm4,xmm4,xmm3
vmulss xmm3,xmm6,xmm3
vmulss xmm5,xmm6,xmm5
vmovss xmm6,dword ptr [7FF7D5035410]
vsubss xmm10,xmm6,xmm4
vsubss xmm10,xmm10,xmm5
vmovups xmm11,[7FF7D5035420]
vmovaps xmm12,xmm11
vmulss xmm10,xmm10,xmm12
vsubss xmm13,xmm9,xmm0
vmovshdup xmm14,xmm11
vmulss xmm13,xmm13,xmm14
vaddss xmm10,xmm10,xmm13
vaddss xmm13,xmm1,xmm8
vunpckhps xmm11,xmm11,xmm11
vmulss xmm13,xmm13,xmm11
vaddss xmm10,xmm10,xmm13
vaddss xmm0,xmm9,xmm0
vmulss xmm0,xmm0,xmm12
vsubss xmm2,xmm6,xmm2
vsubss xmm5,xmm2,xmm5
vmulss xmm5,xmm5,xmm14
vaddss xmm0,xmm0,xmm5
vsubss xmm5,xmm3,xmm7
vmulss xmm5,xmm5,xmm11
vaddss xmm0,xmm0,xmm5
vinsertps xmm0,xmm10,xmm0,10
vsubss xmm1,xmm1,xmm8
vmulss xmm1,xmm1,xmm12
vaddss xmm3,xmm3,xmm7
vmulss xmm3,xmm3,xmm14
vaddss xmm1,xmm1,xmm3
vsubss xmm2,xmm2,xmm4
vmulss xmm2,xmm2,xmm11
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vinsertps xmm0,xmm0,xmm6,30
vmovups [rdx],xmm0
mov rax,rdx
vmovaps xmm6,[rsp+80]
vmovaps xmm7,[rsp+70]
vmovaps xmm8,[rsp+60]
vmovaps xmm9,[rsp+50]
vmovaps xmm10,[rsp+40]
vmovaps xmm11,[rsp+30]
vmovaps xmm12,[rsp+20]
vmovaps xmm13,[rsp+10]
vmovaps xmm14,[rsp]
add rsp,98
ret
; Total bytes of code 376
System.Numerics.Tests.Perf_Vector4.TransformByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Vector4.TransformByQuaternionBenchmark()
sub rsp,98
vzeroupper
vmovaps [rsp+80],xmm6
vmovaps [rsp+70],xmm7
vmovaps [rsp+60],xmm8
vmovaps [rsp+50],xmm9
vmovaps [rsp+40],xmm10
vmovaps [rsp+30],xmm11
vmovaps [rsp+20],xmm12
vmovaps [rsp+10],xmm13
vmovaps [rsp],xmm14
vmovups xmm0,[7FFE7D935400]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm7,xmm2,xmm0
vmulss xmm8,xmm4,xmm0
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm9,xmm4,xmm1
vmulss xmm1,xmm6,xmm1
vmulss xmm4,xmm4,xmm3
vmulss xmm3,xmm6,xmm3
vmulss xmm5,xmm6,xmm5
vmovss xmm6,dword ptr [7FFE7D935410]
vsubss xmm10,xmm6,xmm4
vsubss xmm10,xmm10,xmm5
vmovups xmm11,[7FFE7D935420]
vmovaps xmm12,xmm11
vmulss xmm10,xmm10,xmm12
vsubss xmm13,xmm9,xmm0
vmovshdup xmm14,xmm11
vmulss xmm13,xmm13,xmm14
vaddss xmm10,xmm10,xmm13
vaddss xmm13,xmm1,xmm8
vunpckhps xmm11,xmm11,xmm11
vmulss xmm13,xmm13,xmm11
vaddss xmm10,xmm10,xmm13
vaddss xmm0,xmm9,xmm0
vmulss xmm0,xmm0,xmm12
vsubss xmm2,xmm6,xmm2
vsubss xmm5,xmm2,xmm5
vmulss xmm5,xmm5,xmm14
vaddss xmm0,xmm0,xmm5
vsubss xmm5,xmm3,xmm7
vmulss xmm5,xmm5,xmm11
vaddss xmm0,xmm0,xmm5
vinsertps xmm0,xmm10,xmm0,10
vsubss xmm1,xmm1,xmm8
vmulss xmm1,xmm1,xmm12
vaddss xmm3,xmm3,xmm7
vmulss xmm3,xmm3,xmm14
vaddss xmm1,xmm1,xmm3
vsubss xmm2,xmm2,xmm4
vmulss xmm2,xmm2,xmm11
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vinsertps xmm0,xmm0,xmm6,30
vmovups [rdx],xmm0
mov rax,rdx
vmovaps xmm6,[rsp+80]
vmovaps xmm7,[rsp+70]
vmovaps xmm8,[rsp+60]
vmovaps xmm9,[rsp+50]
vmovaps xmm10,[rsp+40]
vmovaps xmm11,[rsp+30]
vmovaps xmm12,[rsp+20]
vmovaps xmm13,[rsp+10]
vmovaps xmm14,[rsp]
add rsp,98
ret
; Total bytes of code 376
System.Numerics.Tests.Perf_Vector4.TransformVector2ByQuaternionBenchmark
Description of detection logic
Compare Jit Disasm
; System.Numerics.Tests.Perf_Vector4.TransformVector2ByQuaternionBenchmark()
sub rsp,78
vzeroupper
vmovaps [rsp+60],xmm6
vmovaps [rsp+50],xmm7
vmovaps [rsp+40],xmm8
vmovaps [rsp+30],xmm9
vmovaps [rsp+20],xmm10
vmovaps [rsp+10],xmm11
vmovaps [rsp],xmm12
vmovups xmm0,[7FFBC9A453B0]
vmovaps xmm1,xmm0
vaddss xmm2,xmm1,xmm1
vmovshdup xmm3,xmm0
vaddss xmm4,xmm3,xmm3
vunpckhps xmm5,xmm0,xmm0
vaddss xmm6,xmm5,xmm5
vshufps xmm0,xmm0,xmm0,0FF
vmulss xmm7,xmm2,xmm0
vmulss xmm8,xmm4,xmm0
vmulss xmm0,xmm6,xmm0
vmulss xmm2,xmm2,xmm1
vmulss xmm9,xmm4,xmm1
vmulss xmm1,xmm6,xmm1
vmulss xmm4,xmm4,xmm3
vmulss xmm3,xmm6,xmm3
vmulss xmm5,xmm6,xmm5
vmovss xmm6,dword ptr [7FFBC9A453C0]
vsubss xmm4,xmm6,xmm4
vsubss xmm4,xmm4,xmm5
vmovsd xmm10,qword ptr [7FFBC9A453C8]
vmovaps xmm11,xmm10
vmulss xmm4,xmm4,xmm11
vsubss xmm12,xmm9,xmm0
vmovshdup xmm10,xmm10
vmulss xmm12,xmm12,xmm10
vaddss xmm4,xmm4,xmm12
vaddss xmm0,xmm9,xmm0
vmulss xmm0,xmm0,xmm11
vsubss xmm2,xmm6,xmm2
vsubss xmm2,xmm2,xmm5
vmulss xmm2,xmm2,xmm10
vaddss xmm0,xmm0,xmm2
vinsertps xmm0,xmm4,xmm0,10
vsubss xmm1,xmm1,xmm8
vmulss xmm1,xmm1,xmm11
vaddss xmm2,xmm3,xmm7
vmulss xmm2,xmm2,xmm10
vaddss xmm1,xmm1,xmm2
vinsertps xmm0,xmm0,xmm1,20
vinsertps xmm0,xmm0,xmm6,30
vmovups [rdx],xmm0
mov rax,rdx
vmovaps xmm6,[rsp+60]
vmovaps xmm7,[rsp+50]
vmovaps xmm8,[rsp+40]
vmovaps xmm9,[rsp+30]
vmovaps xmm10,[rsp+20]
vmovaps xmm11,[rsp+10]
vmovaps xmm12,[rsp]
add rsp,78
ret
; Total bytes of code 294
Docs
Profiling workflow for dotnet/runtime repository
Thanks! Many of these were known ahead of time and were due to them being "bad benchmarks". That is, they only involved constant inputs of promotable structs and so had large amounts of constant folding leading to non-representative results in the benchmarks. Will leave this open until we finish the work to "fix" the tests.
Moving to future as this is a benchmark issue and not something critical to shipping .NET 8. The actual code for the types in question has been significantly improved for real world scenarios.
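As a loose analogy for the constant-folding point in the comments above: when every input to a benchmark is a compile-time constant, the compiler can precompute the answer and the timed code no longer contains the operation being measured. The sketch below shows the same effect in CPython's bytecode compiler, which is far simpler than the .NET JIT and is used here only for illustration. The helper name `multiply_survives` is hypothetical, and the folding behaviour is a CPython implementation detail rather than a language guarantee.

```python
# Analogy only: CPython's bytecode compiler, not the .NET JIT.
# An expression built purely from literals versus one that reads its
# operands at run time.
folded = compile("2.0 * 3.0", "<constant-inputs>", "eval")
unfolded = compile("a * b", "<runtime-inputs>", "eval")

def multiply_survives(code):
    """Return True if the product was NOT precomputed into the constants table."""
    return 6.0 not in code.co_consts

constant_case = multiply_survives(folded)    # product already stored as a constant
runtime_case = multiply_survives(unfolded)   # multiply still executes at run time

# The runtime-input version still produces the same value when evaluated.
value = eval(unfolded, {"a": 2.0, "b": 3.0})
```

A benchmark shaped like the first expression mostly measures loading a precomputed result, which matches the description of the results as non-representative; feeding inputs from state that is set up outside the measured method keeps the arithmetic in the timed path.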