Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fix x86 stack probing #23881

Merged
merged 3 commits into from
Apr 12, 2019
Merged

Conversation

BruceForstall
Copy link
Member

On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.

So, introduce stack probes for these struct copies.

It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one STACK_ALIGN slot before
the end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.

Added several test case variations showing different large stack
probe situations.

Fixes #23796

On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.

So, introduce stack probes for these struct copies.

It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one `STACK_ALIGN` slot before
the end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.

Added several test case variations showing different large stack
probe situations.

Fixes #23796
@BruceForstall
Copy link
Member Author

@dotnet-bot test Windows_NT x64 Checked pmi_asm_diffs
@dotnet-bot test Windows_NT x86 Checked pmi_asm_diffs

@BruceForstall
Copy link
Member Author

@dotnet-bot test Windows_NT x86 Checked pmi_asm_diffs

@BruceForstall
Copy link
Member Author

@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test

@BruceForstall
Copy link
Member Author

No diffs on x64; no interesting diffs on x86 (I did swap the order of two instructions for constant localloc probes, but the order doesn't affect semantics.) -- for Windows PMI frameworks/benchmarks diffs.

@BruceForstall
Copy link
Member Author

@briansull PTAL
cc @dotnet/jit-contrib

@BruceForstall
Copy link
Member Author

@dotnet-bot test Windows_NT arm Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT x86 Checked no_tiered_compilation

Copy link

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

@BruceForstall BruceForstall merged commit d586523 into dotnet:master Apr 12, 2019
@BruceForstall BruceForstall deleted the FixX86ArgStackProbe branch April 12, 2019 23:55
@CarolEidt CarolEidt mentioned this pull request Apr 14, 2019
@sandreenko
Copy link

I think we should not merge any new tests as Pri1, at least until we get a simple option to trigger Pri1 tests in a pr. I usually push them as pri0 and change to pri1 before the merge (without waiting the actual testing for the change that only changes the priority).

These new tests are failing on many platforms and our weekend testing is all red.

Example: outerloop testing for this PR:

Windows_NT x64 Checked @ Windows.10.Amd64.Open
JIT\\Methodical\\largeframes\\skip4\\skippage4\\skippage4.cmd

\nUnhandled Exception: System.TypeLoadException: Could not load type 'BigFrames.LargeStructWithRef' from assembly 'skippage4, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' because it contains an object field at offset 65500 that is incorrectly aligned or overlapped by a non-object field.\r\n at BigFrames.Test.Test1()\r\n at BigFrames.Test.Main()\n\n\nReturn code: 1\nRaw output file: C:\\dotnetbuild\\work\\d2556ec8-bab3-427a-bde9-68cd3b557d33\\Work\\15d05293-889f-4300-a66b-6fbbdfa592f3\\Exec\\JIT\\Methodical\\Reports\\JIT.Methodical\\largeframes\\skip4\\skippage4\\skippage4.output.txt\nRaw output:\nBEGIN EXECUTION\r\n \"C:\\dotnetbuild\\work\\d2556ec8-bab3-427a-bde9-68cd3b557d33\\Payload\\corerun.exe\" skippage4.exe \r\nExpected: 100\r\nActual: -532462766\r\nEND EXECUTION - FAILED\r\nFAILED\r\nTest Harness Exitcode is : 1\r\n\nTo run the test:\n> set CORE_ROOT=C:\\dotnetbuild\\work\\d2556ec8-bab3-427a-bde9-68cd3b557d33\\Payload\n> C:\\dotnetbuild\\work\\d2556ec8-bab3-427a-bde9-68cd3b557d33\\Work\\15d05293-889f-4300-a66b-6fbbdfa592f3\\Exec\\JIT\\Methodical\\largeframes\\skip4\\skippage4\\skippage4.cmd\n\r\nExpected: True\r\nActual: False

@BruceForstall
Copy link
Member Author

Fixed: #23996

I thought there already was a "azp ... outerloop" option. (I didn't trigger it, and probably should have.) Unfortunately, I only triggered x86 Pri-1, and the failure was with the new test on 64-bit.

@sandreenko
Copy link

I thought there already was a "azp ... outerloop" option.

My bad, it still exists. I thought it was disabled from the jobs that can be triggered with "azp r_u_n" together with jitstress and gcstress jobs, but apparently it was not.

davmason pushed a commit to davmason/coreclr that referenced this pull request Apr 27, 2019
* Fix x86 stack probing

On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.

So, introduce stack probes for these struct copies.

It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one `STACK_ALIGN` slot before
the end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.

Added several test case variations showing different large stack
probe situations.

Fixes #23796

* Increase the argument size probe buffer

* Formatting
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Fix x86 stack probing

On x86, structs are passed by value on the stack. We copy structs
to the stack in various ways, but one way is to first subtract the
size of the struct and then use a "rep movsb" instruction. If the
struct we are passing is sufficiently large, this can cause us to
miss the stack guard page.

So, introduce stack probes for these struct copies.

It turns out the stack pointer after prolog probing can be sitting
near the very end of the guard page (one `STACK_ALIGN` slot before
the end, which allows a "call" instruction which pushes its
return address to touch the guard page with the return address push).
We don't want to probe with every argument push, though. So change
the prolog probing to insert an "extra" touch at the final SP location
if the previous touch was "too far" away, leaving at least some
buffer zone for un-probed SP adjustments. I chose this to be the
size of the largest SIMD register, which also can get copied to the
argument stack with a "SUB;MOV" sequence.

Added several test case variations showing different large stack
probe situations.

Fixes dotnet/coreclr#23796

* Increase the argument size probe buffer

* Formatting


Commit migrated from dotnet/coreclr@d586523
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Passing large structs by value asserts/crashes on ARM32/x86
3 participants