Fix return address hijacking in the stack probing loop #28119

janvorli · 2021-01-11T17:14:52Z

When return address hijacking occurs when the target thread is running
in the stack probing loop in a method with large frame, the unwinder cannot
unwind to the caller frame correctly. That results in a wrong stack slot
being patched by the modified return address, leading to corruption of
locals of the method being executed.

This change fixes the problem by not attempting to hijack a method that's
running in prolog.

This change is not a port, as the .NET 5.0 uses a different method to do stack
probing. That method would require a too large change to be ported though,
so this is a targeted fix.

Customer impact

Customer (RavenDB) database servers processing customer data are crashing on a regular basis (few times a week) with SIGSEGV.
There is no workaround for the problem.
See dotnet/runtime#42885 for more details.

Regression?

No

Testing

Customer testing a custom libcoreclr.so based on 3.1.8 containing this fix. They were testing the change on their production servers for about a month. The intermittent crashes they were seeing on a regular basis have disappeared and their systems were working flawlessly.

Risk

Low, the change just prevents return address hijacking to happen in prolog of methods with large frames. Hijacking attempt is not required to succeed, there are other cases when we cannot hijack a method already, like when running in an epilog, when executing native code etc.

When return address hijacking occurs when the target thread is running in the stack probing loop in a method with large frame, the unwinder cannot unwind to the caller frame correctly. That results in a wrong stack slot being patched by the modified return address, leading to corruption of locals of the method being executed. This change fixes the problem by not attempting to hijack a method that's running in prolog.

jeffschwMSFT

Approved. Please get a cr and we can take for servicing this week.

janvorli · 2021-01-11T18:22:23Z

There is some missing package problem unrelated to this change:

2021-01-11T17:39:35.4836756Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102: Unable to find package Microsoft.NETCore.CoreDisTools with version (>= 1.0.1-prerelease-00005) [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4837927Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 1 version(s) in dotnet-core [ Nearest version: 1.0.1-prerelease-00001 ] [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4839117Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in nuget.org [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4840200Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in myget.org dotnet-core [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4841237Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in dotnet-coreclr [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.5359312Z   Restore failed in 2.61 sec for /__w/1/s/src/tools/r2rdump/R2RDump.csproj.

I wonder why the 1.0.1-prerelease-00005 version disappeared. @echesakovMSFT do you know if we have somehow disabled the previous source of the package after your change that created the new version 00006?

echesakov · 2021-01-11T18:40:24Z

There is some missing package problem unrelated to this change:

2021-01-11T17:39:35.4836756Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102: Unable to find package Microsoft.NETCore.CoreDisTools with version (>= 1.0.1-prerelease-00005) [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4837927Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 1 version(s) in dotnet-core [ Nearest version: 1.0.1-prerelease-00001 ] [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4839117Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in nuget.org [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4840200Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in myget.org dotnet-core [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.4841237Z /__w/1/s/src/tools/r2rdump/R2RDump.csproj : error NU1102:   - Found 0 version(s) in dotnet-coreclr [/__w/1/s/src/build.proj]
2021-01-11T17:39:35.5359312Z   Restore failed in 2.61 sec for /__w/1/s/src/tools/r2rdump/R2RDump.csproj.

I wonder why the 1.0.1-prerelease-00005 version disappeared. @echesakovMSFT do you know if we have somehow disabled the previous source of the package after your change that created the new version 00006?

@janvorli Hmm, I don't know. We shouldn't delete them, since these packages are used by 5.0 and 3.1

hoyosjs · 2021-01-12T02:57:01Z

I can only find said preview 5 package in the dotnet5-trasport feed. My guess is it lived in the dotnet-core myget feed which was shutdown earlier today/yesterday. Not sure it would be safe to add a feed like Microsoft.NETCore.CoreDisTools @ dotnet5-transport to a 3.1 release branch. (cc: @mmitche, @garath, @clairernovotny)

garath · 2021-01-12T04:36:58Z

My guess is it lived in the dotnet-core myget feed which was shutdown earlier today/yesterday.

Looks like that's correct. The dotnet-core feed from dotnet.myget.org is now available as a sleet feed. That should work instead.

If it makes sense and @mmitche agrees, I could manually publish it to the dotnet3.1-transport feed.

hoyosjs · 2021-01-15T20:11:08Z

/azp run coreclr-ci

azure-pipelines · 2021-01-15T20:11:20Z

Azure Pipelines successfully started running 1 pipeline(s).

janvorli added area-VM Servicing-consider Issue for next servicing release review labels Jan 11, 2021

janvorli added this to the 3.1.x milestone Jan 11, 2021

janvorli requested a review from jkotas January 11, 2021 17:14

janvorli self-assigned this Jan 11, 2021

jeffschwMSFT approved these changes Jan 11, 2021

View reviewed changes

jkotas approved these changes Jan 11, 2021

View reviewed changes

leecow added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Jan 12, 2021

leecow modified the milestones: 3.1.x, 3.1.12 Jan 12, 2021

Anipik merged commit fa682a7 into dotnet:release/3.1 Jan 15, 2021

jkotas mentioned this pull request Feb 23, 2021

segfault error 4 in libcoreclr.so dotnet/runtime#42885

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix return address hijacking in the stack probing loop #28119

Fix return address hijacking in the stack probing loop #28119

janvorli commented Jan 11, 2021

jeffschwMSFT left a comment

janvorli commented Jan 11, 2021

echesakov commented Jan 11, 2021

hoyosjs commented Jan 12, 2021

garath commented Jan 12, 2021

hoyosjs commented Jan 15, 2021

azure-pipelines bot commented Jan 15, 2021

Fix return address hijacking in the stack probing loop #28119

Fix return address hijacking in the stack probing loop #28119

Conversation

janvorli commented Jan 11, 2021

Customer impact

Regression?

Testing

Risk

jeffschwMSFT left a comment

Choose a reason for hiding this comment

janvorli commented Jan 11, 2021

echesakov commented Jan 11, 2021

hoyosjs commented Jan 12, 2021

garath commented Jan 12, 2021

hoyosjs commented Jan 15, 2021

azure-pipelines bot commented Jan 15, 2021