
[Perf -29%] PerfLabTests.CastingPerf (2) #37803

Closed

DrewScoggins opened this issue Jun 12, 2020 · 7 comments

@DrewScoggins (Member)
Run Information

Architecture x64
OS Windows 10.0.18362
Changes diff

Regressions in PerfLabTests.CastingPerf

Benchmark                    | Baseline  | Test      | Test/Base | Modality | Baseline Outlier
CheckObjIsInterfaceNo        | 218.29 μs | 280.64 μs | 1.29      |          | False
CheckIsInstAnyIsInterfaceNo  | 218.29 μs | 280.65 μs | 1.29      |          | False

[graphs omitted]
Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'PerfLabTests.CastingPerf*';

Histogram

PerfLabTests.CastingPerf.CheckObjIsInterfaceNo

[217888.775 ; 232960.149) | @@@@@@@@@@@@
[232960.149 ; 252981.511) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@
[252981.511 ; 270375.304) | 
[270375.304 ; 285446.677) | @@@@@@@@@@@@@@@@@@@

PerfLabTests.CastingPerf.CheckIsInstAnyIsInterfaceNo

[209794.928 ; 226755.346) | @@@@@@@@@@@@@
[226755.346 ; 237218.013) | 
[237218.013 ; 254170.131) | @@@@@@@@@@@@@@@@@@@@@@@@@
[254170.131 ; 269951.518) | 
[269951.518 ; 286903.637) | @@@@@@@@@@@@@@@@@@@@
[286903.637 ; 303354.976) | 
[303354.976 ; 320307.095) | @

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Jun 12, 2020
@DrewScoggins DrewScoggins added the tenet-performance-benchmarks Issue from performance benchmark label Jun 12, 2020
@DrewScoggins (Member, Author)

@adamsitnik @VSadov

@jkotas jkotas added area-VM-coreclr and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Jun 12, 2020
@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jun 22, 2020
@mangod9 mangod9 added this to the 5.0.0 milestone Jun 22, 2020
@DrewScoggins DrewScoggins added tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark os-windows arch-x64 and removed tenet-performance-benchmarks Issue from performance benchmark labels Jul 7, 2020
@VSadov VSadov self-assigned this Jul 21, 2020
VSadov (Member) commented Jul 21, 2020

This could be the cost of tiered-compilation indirection showing up on a fast cast. The native implementation did not have that expense.

The cost of the tiering indirection could easily be avoided by opting the casting helper out of tiering. That would mean jitting it at startup time. We decided against that, since these are "fast" casts and generally not a perf concern.

I will take a look to be sure.

mangod9 (Member) commented Aug 6, 2020

@VSadov, the last time we chatted I believe you mentioned this perf was reasonable. Is there anything actionable here?

VSadov (Member) commented Aug 6, 2020

@mangod9 Right. I just looked at this again to be sure. The code that we run here is the same as in the native implementation: these are simple cases that just compare the target interface against the implemented interfaces (which in these benchmarks is a list of length 1).

The difference is that the managed implementation is called via an extra indirection due to the tiering callsite, which is a trade-off for startup latency.
The only reason we see the impact of an extra jmp is that the operation is extremely fast. The measurements are for loops that do 100,000 casts.

I think we can accept the current performance. It is unlikely to cause issues in real world scenarios.

mangod9 (Member) commented Aug 6, 2020

Thanks for the follow-up. Will close this out. @DrewScoggins, is there anything required to track further regressions in the future?

VSadov (Member) commented Aug 6, 2020

BTW, I like that we have tests sensitive to this.

VSadov (Member) commented Aug 6, 2020

Also, I think I can resolve #1994, which asked about the impact of tiering on casting; that is exactly what we are looking at here.

5 participants