Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix return address hijacking in the stack probing loop #28119

Merged
merged 1 commit into from
Jan 15, 2021

Conversation

janvorli
Copy link
Member

When return address hijacking occurs when the target thread is running
in the stack probing loop in a method with large frame, the unwinder cannot
unwind to the caller frame correctly. That results in a wrong stack slot
being patched by the modified return address, leading to corruption of
locals of the method being executed.

This change fixes the problem by not attempting to hijack a method that's
running in prolog.

This change is not a port, as the .NET 5.0 uses a different method to do stack
probing. That method would require a too large change to be ported though,
so this is a targeted fix.

Customer impact

Customer (RavenDB) database servers processing customer data are crashing on a regular basis (few times a week) with SIGSEGV.
There is no workaround for the problem.
See dotnet/runtime#42885 for more details.

Regression?

No

Testing

Customer testing a custom libcoreclr.so based on 3.1.8 containing this fix. They were testing the change on their production servers for about a month. The intermittent crashes they were seeing on a regular basis have disappeared and their systems were working flawlessly.

Risk

Low, the change just prevents return address hijacking to happen in prolog of methods with large frames. Hijacking attempt is not required to succeed, there are other cases when we cannot hijack a method already, like when running in an epilog, when executing native code etc.

When return address hijacking occurs when the target thread is running
in the stack probing loop in a method with large frame, the unwinder cannot
unwind to the caller frame correctly. That results in a wrong stack slot
being patched by the modified return address, leading to corruption of
locals of the method being executed.

This change fixes the problem by not attempting to hijack a method that's
running in prolog.
@janvorli janvorli added area-VM Servicing-consider Issue for next servicing release review labels Jan 11, 2021
@janvorli janvorli added this to the 3.1.x milestone Jan 11, 2021
@janvorli janvorli requested a review from jkotas January 11, 2021 17:14
@janvorli janvorli self-assigned this Jan 11, 2021
Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Please get a cr and we can take for servicing this week.

@janvorli
Copy link
Member Author

There is some missing package problem unrelated to this change:

2021-01-11T17:39:35.4836756Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102: Unable to find package Microsoft.NETCore.CoreDisTools with version (>= 1.0.1-prerelease-00005) [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4837927Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 1 version(s) in dotnet-core [ Nearest version: 1.0.1-prerelease-00001 ] [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4839117Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in nuget.org [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4840200Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in myget.org dotnet-core [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4841237Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in dotnet-coreclr [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.5359312Z   Restore failed in 2.61 sec for /__w/1/s/src/tools/r2rdump/R2RDump.csproj.

I wonder why the 1.0.1-prerelease-00005 version disappeared. @echesakovMSFT do you know if we have somehow disabled the previous source of the package after your change that created the new version 00006?

@echesakov
Copy link

There is some missing package problem unrelated to this change:

2021-01-11T17:39:35.4836756Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102: Unable to find package Microsoft.NETCore.CoreDisTools with version (>= 1.0.1-prerelease-00005) [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4837927Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 1 version(s) in dotnet-core [ Nearest version: 1.0.1-prerelease-00001 ] [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4839117Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in nuget.org [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4840200Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in myget.org dotnet-core [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4841237Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in dotnet-coreclr [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.5359312Z   Restore failed in 2.61 sec for /__w/1/s/src/tools/r2rdump/R2RDump.csproj.

I wonder why the 1.0.1-prerelease-00005 version disappeared. @echesakovMSFT do you know if we have somehow disabled the previous source of the package after your change that created the new version 00006?

@janvorli Hmm, I don't know. We shouldn't delete them, since these packages are used by 5.0 and 3.1

@hoyosjs
Copy link
Member

hoyosjs commented Jan 12, 2021

I can only find said preview 5 package in the dotnet5-trasport feed. My guess is it lived in the dotnet-core myget feed which was shutdown earlier today/yesterday. Not sure it would be safe to add a feed like Microsoft.NETCore.CoreDisTools @ dotnet5-transport to a 3.1 release branch. (cc: @mmitche, @garath, @clairernovotny)

@garath
Copy link
Member

garath commented Jan 12, 2021

My guess is it lived in the dotnet-core myget feed which was shutdown earlier today/yesterday.

Looks like that's correct. The dotnet-core feed from dotnet.myget.org is now available as a sleet feed. That should work instead.

If it makes sense and @mmitche agrees, I could manually publish it to the dotnet3.1-transport feed.

@leecow leecow added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Jan 12, 2021
@leecow leecow modified the milestones: 3.1.x, 3.1.12 Jan 12, 2021
@hoyosjs
Copy link
Member

hoyosjs commented Jan 15, 2021

/azp run coreclr-ci

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@Anipik Anipik merged commit fa682a7 into dotnet:release/3.1 Jan 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-VM Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants