Fix x86 linux tests #57244
src/coreclr/vm/i386/umthunkstub.S
#define STACK_ALIGN_MASK 0xf
mov ebx, esp
sub ebx, 4
and ebx, STACK_ALIGN_MASK
This should not be needed. The caller should guarantee the right alignment.
Yes, that patch doesn't resolve the problem entirely. See #58191 for the question about the original problem.
Thanks, the ebx method was simple to implement.
@dotnet/jit-contrib Please take a look at the JIT changes that are the bulk of this delta. The non-JIT changes LGTM.
src/coreclr/vm/i386/jithelp.S
// Place JIT_WriteBarrierEAX address into stack to call it via ret instruction.
push eax
mov eax, dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]
xchg eax, dword ptr [esp]
// Real return address is still on the stack at esp + 4.
ret
Are we sure that the above removed instructions are not trying to keep the hardware's call-ret stack balanced?
If no IP-relative computation is needed I think the entire function can just be
mov eax, edx
mov edx, ecx
jmp dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]
> Are we sure that the above removed instructions are not trying to keep the hardware's call-ret stack balanced?
Maybe I do not clearly understand the question. I am sure that the removed instructions above keep the hardware's call/ret information on the stack balanced. The removed instructions just calculate the wrong address.
> If no IP-relative computation is needed I think the entire function can just be
> mov eax, edx
> mov edx, ecx
> jmp dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]
Unfortunately, this code jumps to JIT_WriteBarrierEAX_Loc itself, not to the address stored in JIT_WriteBarrierEAX_Loc:
=> 0xf7606c71 <JIT_WriteBarrier_Callable>: mov eax,edx
0xf7606c73 <JIT_WriteBarrier_Callable+2>: mov edx,ecx
0xf7606c75 <JIT_WriteBarrier_Callable+4>: jmp 0xf7ad49b4
gdb$ disassemble 0xf7ad49b4
Dump of assembler code for function JIT_WriteBarrierEAX_Loc:
0xf7ad49b4: enter 0x6075,0xf7
End of assembler dump.
gdb$ disassemble 0xf76075c8
Dump of assembler code for function JIT_WriteBarrierEAX:
0xf76075c8 <+0>: mov DWORD PTR [edx],eax
0xf76075ca <+2>: cmp eax,0xf1a0000c
0xf76075d0 <+8>: jb 0xf76075df <JIT_WriteBarrierEAX+23>
0xf76075d2 <+10>: shr edx,0xa
0xf76075d5 <+13>: nop
0xf76075d6 <+14>: cmp BYTE PTR [edx-0xa4007dc],0xff
0xf76075dd <+21>: jne 0xf76075e2 <JIT_WriteBarrierEAX+26>
0xf76075df <+23>: ret
0xf76075e0 <+24>: nop
0xf76075e1 <+25>: nop
0xf76075e2 <+26>: mov BYTE PTR [edx-0xa4007dc],0xff
0xf76075e9 <+33>: ret
> Maybe I do not clearly understand the question. I am sure that the removed instructions above keep the hardware's call/ret information on the stack balanced. The removed instructions just calculate the wrong address.
The hardware tracks call/ret pairs so that it can predict the targets of ret instructions better. After this change, the hardware might mispredict a lot of the upcoming ret instructions.
> Unfortunately, this code jumps to JIT_WriteBarrierEAX_Loc itself, not to the address stored in JIT_WriteBarrierEAX_Loc:
That's odd. Maybe try AT&T syntax? It should assemble to ff 25 imm32.
I can reproduce it with clang-9 -m32 foo.S locally. Using AT&T syntax instead works and assembles correctly:
.att_syntax
jmpl *C_FUNC(JIT_WriteBarrierEAX_Loc)
Hmm, with GOT we do not need JIT_WriteBarrierEAX_Loc at all. The code below works properly:
mov eax, edx
mov edx, ecx
push eax
call 1f
1:
pop eax
2:
.att_syntax
addl $_GLOBAL_OFFSET_TABLE_+(2b-1b), %eax
.intel_syntax noprefix
mov eax, dword ptr [eax + C_FUNC(JIT_WriteBarrierEAX)@GOT]
xchg eax, dword ptr [esp]
ret
I don't see how this would work with W^X enabled. We don't want to call the JIT_WriteBarrierEAX in the executable, but rather its copy that is placed at a random memory location (in a dynamically allocated memory page).
> I don't see how this would work with W^X enabled. We don't want to call the JIT_WriteBarrierEAX in the executable, but rather its copy that is placed at a random memory location (in a dynamically allocated memory page).
Would the current PR variant, which dereferences JIT_WriteBarrierEAX_Loc, work with W^X enabled?
Yes, if that works with W^X disabled, it will work with it enabled too. JIT_WriteBarrierEAX_Loc points to JIT_WriteBarrierEAX when W^X is disabled and to its copy when W^X is enabled.
Nice. JIT_WriteBarrierEAX_Callable is called in every simple program, and it now works with the current PR on x86 Linux.
[unix x86] Fix tests build
[unix x86] Add register map for crossgen2
    Fixes readytorun/coreroot_determinism/coreroot_determinism/coreroot_determinism.sh on x86 linux
[unix x86] Fix tail calls tests
[unix x86] Fix unmanaged callconv
[unix x86] Fix passing implicit args via stack
[unix x86] Pop hidden retbuff arg on cdecl callconv
[unix x86] Fix WriteBarrier call
[x86] Add calling convention name print to dump
[x86] Use ebx to pass VASigCookie to GenericPInvokeCalliHelper
    It fixes stack alignment in GenericPInvokeCalliHelper on unix x86
Fix storageType overflow assertion in TinyArray
This PR contains commits that fix build and runtime test bugs on the Linux x86 platform. The majority of them fix calling convention issues.