JIT not able to inline methods that contain generic virtual calls #59075
Comments
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
The disassembly of the two methods is shown below.
Reflected
; StaticReflectionTest.TestClass.Reflected()
sub rsp,38
xor eax,eax
mov [rsp+28],rax
mov [rsp+30],rax
lea rcx,[rsp+28]
call StaticReflectionTest.TestClass.CreateState()
lea rcx,[rsp+28]
call StaticReflectionTest.TestClass.ResetVelocity[[StaticReflectionTest.EntityState2D, StaticReflectionTest]](StaticReflectionTest.EntityState2D ByRef)
nop
add rsp,38
ret
; Total bytes of code 42
; StaticReflectionTest.TestClass.ResetVelocity[[StaticReflectionTest.EntityState2D, StaticReflectionTest]](StaticReflectionTest.EntityState2D ByRef)
sub rsp,28
xor eax,eax
mov [rsp+20],rax
lea rdx,[rsp+20]
call StaticReflectionTest.EntityState2D.Reflect[[StaticReflectionTest.TestClass+ResetVelocityConsumer, StaticReflectionTest]](ResetVelocityConsumer ByRef)
nop
add rsp,28
ret
; Total bytes of code 27
; StaticReflectionTest.EntityState2D.Reflect[[StaticReflectionTest.TestClass+ResetVelocityConsumer, StaticReflectionTest]](ResetVelocityConsumer ByRef)
add rcx,8
xor eax,eax
mov [rcx],rax
ret
; Total bytes of code 10
Direct (reference)
; StaticReflectionTest.TestClass.Direct()
sub rsp,38
vzeroupper
xor eax,eax
mov [rsp+28],rax
mov [rsp+30],rax
lea rcx,[rsp+28]
call StaticReflectionTest.TestClass.CreateState()
vxorps xmm0,xmm0,xmm0
vmovsd qword ptr [rsp+30],xmm0
add rsp,38
ret
; Total bytes of code 44
Probably related to #59002.
It's a known issue - inliner gives up on virtual calls over generics currently (
A similar issue:
using System;
class Program : IDisposable
{
    public static void Main()
    {
        Program r = new();
        Test(ref r);
    }
    static void Test<T>(ref T foo)
        where T : IDisposable => foo.Dispose();
    public void Dispose() {}
}
codegen for main:
; Method Program:Main()
G_M27646_IG01: ;; offset=0000H
4883EC28 sub rsp, 40
;; bbWeight=1 PerfScore 0.25
G_M27646_IG02: ;; offset=0004H
48B9C01538C1FB7F0000 mov rcx, 0x7FFBC13815C0
E85D9DA15F call CORINFO_HELP_NEWSFAST
488BC8 mov rcx, rax
49BB4000D9C0FB7F0000 mov r11, 0x7FFBC0D90040
FF155AD0E7FF call [System.IDisposable:Dispose():this]
90 nop
;; bbWeight=1 PerfScore 5.00
G_M27646_IG03: ;; offset=0027H
4883C428 add rsp, 40
C3 ret
;; bbWeight=1 PerfScore 1.25
; Total bytes of code: 44
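To make the shape of that constrained call easier to experiment with, here is a minimal sketch that runs the same `Test<T>` helper over a class and over a struct. `ClassResource`, `StructResource`, `ConstrainedCallDemo` and `DisposeCount` are hypothetical names introduced only for illustration. The sketch shows the observable behaviour only: reference-type instantiations of a generic method share one canonical body, while each value-type instantiation gets its own specialized code. It does not establish what RyuJIT inlines in either case.

```csharp
using System;

// Hypothetical reference-type resource (not from the issue).
public sealed class ClassResource : IDisposable
{
    public int DisposeCount;
    public void Dispose() => DisposeCount++;
}

// Hypothetical value-type resource (not from the issue).
public struct StructResource : IDisposable
{
    public int DisposeCount;
    public void Dispose() => DisposeCount++;
}

public static class ConstrainedCallDemo
{
    // Same shape as Test<T> in the comment above: a constrained call
    // to an interface method through a generic type parameter.
    public static void Test<T>(ref T foo) where T : IDisposable => foo.Dispose();

    // Runs the helper over both kinds of T and returns the two counters.
    public static (int ClassCount, int StructCount) Run()
    {
        var c = new ClassResource();
        Test(ref c);   // T is a reference type: shared generic code

        var s = new StructResource();
        Test(ref s);   // T is a value type: specialized code, no boxing,
                       // so the struct is mutated in place

        return (c.DisposeCount, s.DisposeCount);
    }
}
```

Calling `ConstrainedCallDemo.Run()` returns a count of 1 for both the class and the struct. Comparing the JIT output for the two `Test<T>` instantiations is one way to see how differently the two cases are compiled.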
Could you elaborate on that? I think both EDIT I changed
Description
I am trying to make a static reflection system in C#, which is basically to enumerate a struct's fields by ref without requiring any reflection from the runtime. I found that the performance does not fully meet the expectation, probably due to some generic methods not being inlined.
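The linked benchmark source is not reproduced in this thread, so the following is only a minimal sketch of what such a static reflection pattern could look like. The type and method names `EntityState2D`, `ResetVelocityConsumer`, `ResetVelocity`, `Reflect`, `CreateState`, `Direct` and `Reflected` come from the disassembly in this issue. Everything else is an assumption made for illustration: the interfaces `IFieldConsumer` and `IReflectable`, the field types `Position2D` and `Velocity2D`, the field names, and all method bodies.

```csharp
using System;

// Hypothetical field types; the real project may use different ones.
public struct Position2D { public float X, Y; }
public struct Velocity2D { public float X, Y; }

// Hypothetical visitor interface: Consume is a generic interface method,
// called once per field with a reference to that field.
public interface IFieldConsumer
{
    void Consume<T>(ref T field);
}

// Hypothetical contract for types that can enumerate their own fields.
public interface IReflectable
{
    void Reflect<TConsumer>(ref TConsumer consumer)
        where TConsumer : struct, IFieldConsumer;
}

public struct EntityState2D : IReflectable
{
    public Position2D Position;
    public Velocity2D Velocity;

    // Hands every field to the consumer by ref; no System.Reflection involved.
    public void Reflect<TConsumer>(ref TConsumer consumer)
        where TConsumer : struct, IFieldConsumer
    {
        consumer.Consume(ref Position);
        consumer.Consume(ref Velocity);
    }
}

public struct ResetVelocityConsumer : IFieldConsumer
{
    public void Consume<T>(ref T field)
    {
        // Zero only the fields whose type is Velocity2D.
        if (typeof(T) == typeof(Velocity2D))
            field = default;
    }
}

public static class TestClass
{
    public static EntityState2D CreateState() => new EntityState2D
    {
        Position = new Position2D { X = 1f, Y = 2f },
        Velocity = new Velocity2D { X = 3f, Y = 4f },
    };

    // Generic helper: a constrained call to a generic interface method.
    public static void ResetVelocity<T>(ref T state) where T : struct, IReflectable
    {
        var consumer = new ResetVelocityConsumer();
        state.Reflect(ref consumer);
    }

    public static EntityState2D Reflected()
    {
        var state = CreateState();
        ResetVelocity(ref state);
        return state;
    }

    public static EntityState2D Direct()
    {
        var state = CreateState();
        state.Velocity = default;
        return state;
    }
}
```

In this sketch `Reflected()` and `Direct()` return the same value, which is the sense in which the two benchmarked methods are equivalent. The question raised in the issue is whether the JIT collapses the generic call chain in `Reflected()` into the single store that `Direct()` performs.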
The source code used in benchmarking is here.
Configuration
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1165 (21H1/May2021Update)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.7.21379.14
[Host] : .NET 6.0.0 (6.0.21.37719), X64 RyuJIT [AttachedDebugger]
DefaultJob : .NET 6.0.0 (6.0.21.37719), X64 RyuJIT
Regression?
No.
Data
Source code is in the Description section above. The two methods Direct and Reflected are equivalent.
For Direct, much of the time should be spent in CreateState(), which is unrelated to the issue. Considering that a vxorps + vmovsd pair should be fast, the difference between the two indicates that the method calls that are not inlined have a large effect on performance, especially when the reflection is done inside a tight loop, which is the expected usage of this static reflection mechanism.
The disassembly of the two methods is shown below. Note that in Reflected, these two methods are not inlined:
Reflected
Direct (reference)
Shared method CreateState()
category:cq
theme:inlining
skill-level:expert
cost:medium
impact:medium