Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix x86 linux tests #57244

Merged
merged 1 commit into from
Sep 18, 2021
Merged

Conversation

t-mustafin
Copy link
Contributor

This PR contains commits which fixes build and runtime tests bugs on linux x86 platform. A majority of them fixes calling convention issues.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Aug 11, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 11, 2021
@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from 8ce1079 to 6dd78be Compare August 12, 2021 12:33
@t-mustafin
Copy link
Contributor Author

cc @alpencolt @gbalykov

@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch 3 times, most recently from eb31d7b to ea6200e Compare August 18, 2021 04:22
@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from 052bdae to 100a75b Compare August 23, 2021 20:43
@t-mustafin t-mustafin marked this pull request as ready for review August 24, 2021 13:14
@t-mustafin
Copy link
Contributor Author

cc @jkotas @janvorli

#define STACK_ALIGN_MASK 0xf
mov ebx, esp
sub ebx, 4
and ebx, STACK_ALIGN_MASK
Copy link
Member

@jkotas jkotas Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be needed. The caller should guarantee the right alignment.

Copy link
Contributor Author

@t-mustafin t-mustafin Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that patch don't resolve problem entirely. Question about original problem #58191.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, ebx-method was simple to implement.

@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from 100a75b to fcb94eb Compare August 27, 2021 16:41
@jkotas
Copy link
Member

jkotas commented Aug 27, 2021

@dotnet/jit-contrib Please take a look at the JIT changes that are bulk of this delta. The non-JIT changes LGTM.

Comment on lines 391 to 402
// Place JIT_WriteBarrierEAX address into stack to call it via ret instuction.
push eax
mov eax, dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]
xchg eax, dword ptr [esp]
// Real return address is still on the stack at esp + 4.
ret
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that the above removed instructions are not trying to keep the hardware's call-ret stack balanced?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no IP-relative computation is needed I think the entire function can just be

mov eax, edx
mov edx, ecx
jmp dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that the above removed instructions are not trying to keep the hardware's call-ret stack balanced?

Maybe I do not clearly understand the question. I am sure that the above removed instructions keep the hardware's call-ret info on stack balanced. Removed instructions just calculate wrong address.

If no IP-relative computation is needed I think the entire function can just be

mov eax, edx
mov edx, ecx
jmp dword ptr [C_FUNC(JIT_WriteBarrierEAX_Loc)]

Unfortunately this code jumps to JIT_WriteBarrierEAX_Loc, not to the address stored in JIT_WriteBarrierEAX_Loc:

=> 0xf7606c71 <JIT_WriteBarrier_Callable>:      mov    eax,edx
   0xf7606c73 <JIT_WriteBarrier_Callable+2>:    mov    edx,ecx
   0xf7606c75 <JIT_WriteBarrier_Callable+4>:    jmp    0xf7ad49b4
gdb$ disassemble 0xf7ad49b4
Dump of assembler code for function JIT_WriteBarrierEAX_Loc:
   0xf7ad49b4:  enter  0x6075,0xf7
End of assembler dump.
gdb$ disassemble 0xf76075c8
Dump of assembler code for function JIT_WriteBarrierEAX:
   0xf76075c8 <+0>:     mov    DWORD PTR [edx],eax
   0xf76075ca <+2>:     cmp    eax,0xf1a0000c
   0xf76075d0 <+8>:     jb     0xf76075df <JIT_WriteBarrierEAX+23>
   0xf76075d2 <+10>:    shr    edx,0xa
   0xf76075d5 <+13>:    nop
   0xf76075d6 <+14>:    cmp    BYTE PTR [edx-0xa4007dc],0xff
   0xf76075dd <+21>:    jne    0xf76075e2 <JIT_WriteBarrierEAX+26>
   0xf76075df <+23>:    ret    
   0xf76075e0 <+24>:    nop
   0xf76075e1 <+25>:    nop
   0xf76075e2 <+26>:    mov    BYTE PTR [edx-0xa4007dc],0xff
   0xf76075e9 <+33>:    ret 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I do not clearly understand the question. I am sure that the above removed instructions keep the hardware's call-ret info on stack balanced. Removed instructions just calculate wrong address.

The hardware tracks call/ret pairs to be able to branch predict ret instructions better. It means that after this change the hardware might mispredict a lot of the upcoming ret instructions.

Unfortunately this code jumps to JIT_WriteBarrierEAX_Loc, not to the address stored in JIT_WriteBarrierEAX_Loc:

That's odd. Maybe try AT&T syntax? It should assemble to ff 25 imm32.

Copy link
Member

@jakobbotsch jakobbotsch Aug 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reproduce it with clang-9 -m32 foo.S locally. Using AT&T syntax instead works and assembles correctly:

.att_syntax
    jmpl *C_FUNC(JIT_WriteBarrierEAX_Loc)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, with GOT we do not need JIT_WriteBarrierEAX_Loc at all. The code below works properly:

    mov eax, edx
    mov edx, ecx
    push    eax
    call    1f
1:
    pop     eax
2:
.att_syntax
    addl     $_GLOBAL_OFFSET_TABLE_+(2b-1b), %eax
.intel_syntax noprefix
    mov     eax, dword ptr [eax + C_FUNC(JIT_WriteBarrierEAX)@GOT]
    xchg    eax, dword ptr [esp]
    ret

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this would work with the W^X enabled. We don't want to call the JIT_WriteBarrierEAX in the executable, but rather its copy that is placed at a random memory location (in a dynamically allocated memory page).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this would work with the W^X enabled. We don't want to call the JIT_WriteBarrierEAX in the executable, but rather its copy that is placed at a random memory location (in a dynamically allocated memory page).

Would current PR variant with JIT_WriteBarrierEAX_Loc dereferencing work with W^X enabled?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if that works with W^X disabled, it would work with it enabled too. The JIT_WriteBarrierEAX_Loc points to the JIT_WriteBarrierEAX with W^X disabled and to its copy with W^X enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, JIT_WriteBarrierEAX_Callable is called in every simple program and now it works with current PR on x86 linux.

src/coreclr/inc/corinfo.h Outdated Show resolved Hide resolved
src/coreclr/jit/reglist.h Show resolved Hide resolved
src/coreclr/vm/i386/asmhelpers.S Outdated Show resolved Hide resolved
src/coreclr/vm/i386/asmhelpers.asm Outdated Show resolved Hide resolved
@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from ae3c64f to 8a1bae9 Compare August 31, 2021 15:21
@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from 8a1bae9 to 84f81ba Compare September 3, 2021 15:28
[unix x86] Fix tests build
[unix x86] Add register map for crossgen2
            Fixes readytorun/coreroot_determinism/coreroot_determinism/coreroot_determinism.sh on x86 linux
[unix x86] Fix tail calls tests
[unix x86] Fix unmanaged callconv
[unix x86] Fix passing implicit args via stack
[unix x86] Pop hidden retbuff arg on cdecl callconv
[unix x86] Fix WriteBarrier call

[x86] Add calling convention name print to dump
[x86] Use ebx to pass VASigCookie to GenericPInvokeCalliHelper
            It fixes stack alignment in GenericPInvokeCalliHelper on unix x86

Fix storageType overflow assertion in TinyArray
@t-mustafin t-mustafin force-pushed the Fix_x86_linux_tests_build branch from 84f81ba to caa59b9 Compare September 3, 2021 15:31
@t-mustafin t-mustafin requested a review from janvorli September 3, 2021 15:35
@BruceForstall BruceForstall merged commit 7263ce5 into dotnet:main Sep 18, 2021
@t-mustafin t-mustafin deleted the Fix_x86_linux_tests_build branch September 20, 2021 07:36
@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants