-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable JitDisasm in Release #73365
Enable JitDisasm in Release #73365
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsThis PR enables It doesn't print lcl tables and gc info (for now). Adds ~30kb to the binary size of jit. The only noticeable impact on Release is that Example: public class Tests
{
static int staticField;
string instanceField;
public virtual void VirtualCall() {}
public void Test(string[] args, IDisposable d)
{
// control flow
if (args?.Length == 0)
{
Console.WriteLine("field: " + staticField);
Console.WriteLine(instanceField);
}
else
{
// SIMD (data section)
Console.WriteLine(Vector128.Create(1, 2, 3, 4));
}
// Virtual call
VirtualCall();
// Interface call
d?.Dispose();
// Jump table
switch (args.Length)
{
case 1:
Console.WriteLine("1");
break;
case 2:
Console.WriteLine("2");
break;
case 3:
Console.WriteLine("3");
break;
case 4:
Console.WriteLine("5");
break;
case 6:
Console.WriteLine("7");
break;
}
Console.ReadKey();
}
} ; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
IG_01:
57 push rdi
56 push rsi
53 push rbx
4883EC30 sub rsp, 48
C5F877 vzeroupper
488BF1 mov rsi, rcx
488BFA mov rdi, rdx
498BD8 mov rbx, r8
IG_02:
4885FF test rdi, rdi
743D je SHORT G_M000_IG04
IG_03:
837F0800 cmp dword ptr [rdi+8], 0
7537 jne SHORT G_M000_IG04
8B0DF8B40C00 mov ecx, dword ptr [(reloc 0x7ffd2d2dc06c)]
FF15560A1100 call [Number:Int32ToDecStr(int):String]
488BD0 mov rdx, rax
48B9D820C0EF74020000 mov rcx, 0x274EFC020D8
488B09 mov rcx, gword ptr [rcx]
FF15D0100900 call [String:Concat(String,String):String]
488BC8 mov rcx, rax
FF15FFAE0F00 call [Console:WriteLine(String)]
488B4E08 mov rcx, gword ptr [rsi+8]
FF15F5AE0F00 call [Console:WriteLine(String)]
EB27 jmp SHORT G_M000_IG05
IG_04:
48B978D62F2DFD7F0000 mov rcx, 0x7FFD2D2FD678
E8FCAEAF5F call CORINFO_HELP_NEWSFAST
C5F91005E4000000 vmovupd xmm0, xmmword ptr [reloc @RWD00]
C5F9114008 vmovupd xmmword ptr [rax+8], xmm0
488BC8 mov rcx, rax
FF15B6AE0F00 call [Console:WriteLine(Object)]
IG_05:
488BCE mov rcx, rsi
488B06 mov rax, qword ptr [rsi]
488B4040 mov rax, qword ptr [rax+64]
FF5020 call [rax+32]Tests:VirtualCall():this
4885DB test rbx, rbx
7410 je SHORT G_M000_IG07
IG_06:
488BCB mov rcx, rbx
49BB0800062DFD7F0000 mov r11, 0x7FFD2D060008
41FF13 call [r11]IDisposable:Dispose():this
IG_07:
8B4F08 mov ecx, dword ptr [rdi+8]
FFC9 dec ecx
83F905 cmp ecx, 5
0F877F000000 ja G_M000_IG13
8BC9 mov ecx, ecx
488D05AD000000 lea rax, [reloc @RWD16]
8B0488 mov eax, dword ptr [rax+4*rcx]
488D1556FFFFFF lea rdx, G_M000_IG02
4803C2 add rax, rdx
FFE0 jmp rax
IG_08:
48B9E020C0EF74020000 mov rcx, 0x274EFC020E0
488B09 mov rcx, gword ptr [rcx]
FF1573AE0F00 call [Console:WriteLine(String)]
EB52 jmp SHORT G_M000_IG13
IG_09:
48B9E820C0EF74020000 mov rcx, 0x274EFC020E8
488B09 mov rcx, gword ptr [rcx]
FF155EAE0F00 call [Console:WriteLine(String)]
EB3D jmp SHORT G_M000_IG13
IG_10:
48B9F020C0EF74020000 mov rcx, 0x274EFC020F0
488B09 mov rcx, gword ptr [rcx]
FF1549AE0F00 call [Console:WriteLine(String)]
EB28 jmp SHORT G_M000_IG13
IG_11:
48B9F820C0EF74020000 mov rcx, 0x274EFC020F8
488B09 mov rcx, gword ptr [rcx]
FF1534AE0F00 call [Console:WriteLine(String)]
EB13 jmp SHORT G_M000_IG13
IG_12:
48B90021C0EF74020000 mov rcx, 0x274EFC02100
488B09 mov rcx, gword ptr [rcx]
FF151FAE0F00 call [Console:WriteLine(String)]
IG_13:
488D4C2420 lea rcx, bword ptr
33D2 xor edx, edx
FF158AEE0F00 call [ConsolePal:ReadKey(bool):ConsoleKeyInfo]
90 nop
IG_14:
4883C430 add rsp, 48
5B pop rbx
5E pop rsi
5F pop rdi
C3 ret
RWD00 dq 0000000200000001h, 0000000400000003h
RWD16 dd 000000AFh ; case IG_08
dd 000000C4h ; case IG_09
dd 000000D9h ; case IG_10
dd 000000EEh ; case IG_11
dd 00000116h ; case IG_13
dd 00000103h ; case IG_12
; Total bytes of code 319
|
This would be super helpful. |
Does this PR also cover crossgen2? |
Yes works for R2R. And, probably, for NativeAOT as well but afair it redirects stdout cc @MichalStrehovsky |
One noticeable advantage over sharplab for end-users is the ability to print names for helper calls, e.g. public class C
{
object o;
public void M(object[] a)
{
a[0] = 42;
o = a;
}
} Sharplab:JitDisasm:; Assembly listing for method C:M(ref):this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
G_M000_IG01: ;; offset=0000H
57 push rdi
56 push rsi
4883EC28 sub rsp, 40
488BF9 mov rdi, rcx
488BF2 mov rsi, rdx
G_M000_IG02: ;; offset=000CH
48B958E4DC2CFD7F0000 mov rcx, 0x7FFD2CDCE458
E835AFB15F call CORINFO_HELP_NEWSFAST
C740082A000000 mov dword ptr [rax+8], 42
4C8BC0 mov r8, rax
488BCE mov rcx, rsi
33D2 xor edx, edx
E891F4FFFF call CORINFO_HELP_ARRADDR_ST
488D4F08 lea rcx, bword ptr [rdi+8]
488BD6 mov rdx, rsi
E875F4E2FF call CORINFO_HELP_ASSIGN_REF
90 nop
G_M000_IG03: ;; offset=003CH
4883C428 add rsp, 40
5E pop rsi
5F pop rdi
C3 ret
; Total bytes of code 67 Perhaps, it will be useful for sharplab too |
@EgorBo do you plan to merge this to .NET 7? |
It probably does not meet the bar but I'd love to merge it in .NET 7.0, but no strong opinion |
FWIW, I would love that, too :) (as long as there's no meaningful perf regressions associated with it and there are no concerns about the data this exposes). |
Yep, we discussed this one too and came to conclusion that there is nothing to be afraid of and our competitors provide this ability too |
NativeAOT compiler and crossgen use exact same infrastructure. If it works for one, it should work for the other one too. You have to specify the option on the command line instead of env variable. Is that correct?
We should not be injecting new life into I am wondering why we need both |
Correct, e.g.
I fully agree, should I still keep NgenDisasm for backward compatibility? |
There is no backward compatibility to be worried about. We should delete all configs that start with If we have any infra that use the I see a few places that clear |
https://godbolt.org/ relies on it to print ASM for C#, but I assume it will only benefit from moving to release .NET as it won't have to compile .NET from sources and use public previews/daily builds of Release .NET instead cc @hez2010 |
2849219
to
98a2a9d
Compare
There is nothing to worry about. We can definitely setup a new net7.0 config for godbolt while keeping net6.0 as is. |
Outerloop pipelines failures seem unrelated:
|
I have started a non-PR run of The change is somewhat low risk for release since when My main concern would be that we have an |
Right, I audited those too (and inserted a failfast if we entered them with .disAsm = false) locally |
@dotnet/jit-contrib @jakobbotsch @BruceForstall PTAL |
Seems like we should allow users to specify If you are live debugging it would be nice to see the actual addresses and such. runtime/src/coreclr/jit/disasm.cpp Lines 1461 to 1467 in e71a958
Might also be nice to be able to control There are a whole bunch more like this: unwind, eh, ... but they might take a lot more work. |
@EgorBo |
Yes, I had to draw the line somewhere for the initial impl since, obviously, supporting all of them is a lot more work, I am going to take a look at |
Good question, I assume some 3rd party tools are going to depend on it (parse) and we should not ever change it |
There could be a separate env var that'd enable a special stable parsable output, something like git does for tools invoking it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. It would be nice if the number of places doing + m_debugInfoSize
were more minimized, via wrapper functions or the like.
It would also be nice if some of the additional controls were available, as previously suggested, like JitStdOutFile, JitDiffableDasm, etc.
@@ -1650,10 +1649,33 @@ class emitter | |||
#endif // MULTIREG_HAS_SECOND_GC_RET | |||
}; | |||
|
|||
// TODO-Cleanup: Uses of stack-allocated instrDescs should be refactored to be unnecessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should we eliminate stack-allocated instrDescs? It seems perfectly reasonable to allow them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are creating these temporary instrDesc
simply to be able to call existing methods to emit code/do printing. That seems like a code smell that the codegen/printing methods themselves should be refactored so that we can use the specific parts we want directly instead of having to indirect through an instrDesc
structure. BTW, there's just three uses of this pattern:
- One in emitxarch.cpp that wants to call an emit method. It already has this same cleanup comment.
- One in emitarm.cpp and emitarm64.cpp for printing
LARGEJMP
instructions.
{ | ||
private: | ||
instrDescDebugInfo* idDebugInfo; | ||
alignas(alignof(T)) char idStorage[sizeof(T)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the static_assert_no_msg
below, isn't this alignas
not useful? Or equivalent to alignas(sizeof(instrDescDebugInfo*))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, some future instrDesc
could have larger-than-pointer alignment requirement. That would certainly require some work to support and I just want it to be noisy that this is a place that would need to be fixed.
Started to review, but it seems we are not printing |
I agree with @BruceForstall . There are too many places where we are doing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Consolidate most uses of m_debugInfoSize into these functions.
I pushed a commit that adds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The functions are a nice improvement. |
We get two new IAT entries on Windows: |
Merging now. We're going to keep eyes on TP-related benchmarks (e.g. "Crossgen2 Throughput Trends" page) in coming weeks. Kudos to @jakobbotsch for mitigating all the initial TP/Memory regressions 🎉 |
This PR enables
DOTNET_JitDisasm
for Release config.It doesn't print lcl tables, EH and gc info (for now).
Adds ~30kb to the binary size of jit.
The only noticeable impact on Release is that
instrDesc
struct is now 8bytes bigger (_idDebugOnlyInfo
) - but I might fix it with a standalone hash-table which will be populated only when JitDisasm variable is not empty.Example:
DOTNET_JitDisasm=Test
: (DOTNET_JitDiffableDasm
is0
by default, hence, raw bytes, addresses)