[release/8.0] Fix stack_limit handling #91095

Merged
merged 4 commits into from
Aug 26, 2023

@github-actions github-actions bot commented Aug 24, 2023

Backport of #90937 to release/8.0

/cc @janvorli

Customer Impact

When a hardware exception happens on Windows x86 while a function on the stack did, or is potentially going to do, a pinvoke, and the related InlinedCallFrame is the topmost explicit frame, the stack limit that the GC uses to detect whether an address is on the stack is garbage. Depending on the garbage value, the problem might go unnoticed, or the GC might falsely dismiss some stack locations as not being on the current stack. Because that check is used to determine whether a reference is an interior pointer, the result can be a GC hole or other GC problems.
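
To illustrate why a garbage limit matters, here is a minimal sketch of a range check of this kind. It is a simplified model, not the CoreCLR code: the `StackBounds` type, the `IsAddressOnStack` helper, and the assumption that the limit is the lower bound of a downward-growing stack are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical, simplified model; not the CoreCLR implementation.
// The stack is assumed to grow down, so `limit` is the lowest address
// treated as part of the current stack and `base` is one past the highest.
struct StackBounds {
    std::uintptr_t limit;
    std::uintptr_t base;
};

// True when `addr` lies inside [limit, base).
inline bool IsAddressOnStack(const StackBounds& b, std::uintptr_t addr) {
    return addr >= b.limit && addr < b.base;
}
```

In this model, a junk `limit` that happens to be higher than a live stack slot makes the check report that slot as not being on the stack, which is the false dismissal described above.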

Testing

A targeted repro test, local coreclr / libraries testing, CI tests.

Risk

Low. The fix effectively disables the use of the stack limit extracted from the topmost frame on Windows x86. The stack limit was originally introduced to address a Unix-specific issue with dynamic stack size, which cannot happen on Windows.

There is a problem with computing stack_limit in the presence of an
inactive InlinedCallFrame. We were trying to call GetCallSiteSP on that
frame to get the call site SP. But when the inlined call frame is not
active (there is no call to native code in progress), that method
returns random junk.

But even if we checked for an active call before reading the value, we
would not get a correct stack limit on x86 Windows when there was no
active call and the inlined call frame was the topmost one. That
can happen with hardware exception handling on Windows x86. On other
targets, we always have another explicit frame on top of the explicit
frame stack, but on Windows x86, we don't use the FaultingExceptionFrame for
hardware exceptions, so the inactive inlined call frame could be the
topmost one when GC starts to scan stack roots.
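
The "check for an active call before reading the value" idea can be sketched as follows. This is a hypothetical model, not the runtime's frame layout: `InlinedCallFrameModel`, its fields, and the convention that a zero return address marks an inactive frame are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical model of an inlined call frame; not the CoreCLR type.
struct InlinedCallFrameModel {
    std::uintptr_t callerReturnAddress; // assumption: 0 while no native call is in progress
    std::uintptr_t callSiteSP;          // only meaningful while a call is in progress
};

inline bool HasActiveCall(const InlinedCallFrameModel& f) {
    return f.callerReturnAddress != 0;
}

// Writes the call site SP to *limit and returns true only for an active
// frame; for an inactive frame *limit is left untouched.
inline bool TryGetStackLimit(const InlinedCallFrameModel& f, std::uintptr_t* limit) {
    if (!HasActiveCall(f)) {
        return false;
    }
    *limit = f.callSiteSP;
    return true;
}
```

Such a guard only avoids reading junk. When the inactive frame is the topmost explicit frame, the caller still has no usable limit, which is the Windows x86 case described above.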

Since the stack_limit was introduced to fix a Unix-specific problem, I
have fixed that by disabling the stack_limit usage for x86 Windows.
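
One way to picture a per-target switch of this kind is the sketch below. It is not the actual change in this PR: `kUseFrameStackLimit`, `EffectiveStackLimit`, and the choice of the compiler macros `_WIN32` and `_M_IX86` to detect 32-bit x86 Windows are assumptions for illustration.

```cpp
#include <cstdint>

// Assumption: MSVC-style predefined macros identify 32-bit x86 Windows.
#if defined(_WIN32) && defined(_M_IX86)
constexpr bool kUseFrameStackLimit = false; // frame-derived limit disabled
#else
constexpr bool kUseFrameStackLimit = true;
#endif

// Picks the bound to use: the frame-derived limit where it is enabled,
// otherwise a fixed per-thread bound (both parameters are hypothetical).
inline std::uintptr_t EffectiveStackLimit(bool useFrameLimit,
                                          std::uintptr_t frameLimit,
                                          std::uintptr_t threadLimit) {
    return useFrameLimit ? frameLimit : threadLimit;
}

// Convenience overload that applies the build-time setting.
inline std::uintptr_t EffectiveStackLimit(std::uintptr_t frameLimit,
                                          std::uintptr_t threadLimit) {
    return EffectiveStackLimit(kUseFrameStackLimit, frameLimit, threadLimit);
}
```

With the switch off, a junk frame-derived value is never consulted, so it cannot influence the on-stack check.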

Close #86265
@janvorli janvorli self-assigned this Aug 24, 2023
@janvorli janvorli added the Servicing-consider Issue for next servicing release review label Aug 24, 2023
@janvorli janvorli added this to the 8.0.0 milestone Aug 24, 2023

@jeffschwMSFT jeffschwMSFT left a comment

approved. once ready this pr can be merged

@jeffschwMSFT jeffschwMSFT added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Aug 25, 2023
@carlossanlop carlossanlop merged commit f0ad1a7 into release/8.0 Aug 26, 2023
111 of 114 checks passed
@carlossanlop carlossanlop deleted the backport/pr-90937-to-release/8.0 branch August 26, 2023 00:16
@ghost ghost locked as resolved and limited conversation to collaborators Sep 25, 2023
Labels
area-VM-coreclr Servicing-approved Approved for servicing release