zengandrew
Contributor

On SysV AMD64, structs returned in a float and int register pair were being classified as RT_Scalar_XX. This caused downstream consumers (e.g., HijackFrame::GcScanRoots) to look for objs/byrefs in the second int register. Per the ABI, however, the leading float is returned in a float register and the obj/byref is returned in the first int register. We now detect and fix this case by skipping the first float type in the ReturnKind encoding and moving the second type into the first.

Fix #115815
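
To illustrate the adjustment described above, here is a minimal, self-contained sketch. The names `RegKind`, `EightByteClass`, `KindOf`, and `IntRegKinds` are simplified stand-ins invented for this example; they are not the runtime's definitions, and the real change operates on the ReturnKind encoding rather than on a plain array.

```cpp
#include <array>

// Illustrative stand-ins only; not the runtime's actual enums.
enum class RegKind { Scalar, Object, ByRef };
enum class EightByteClass { Integer, SSE, IntegerReference, IntegerByRef };

// Maps one eightbyte classification to the GC kind of the value it holds.
RegKind KindOf(EightByteClass c)
{
    switch (c)
    {
        case EightByteClass::IntegerReference: return RegKind::Object;
        case EightByteClass::IntegerByRef:     return RegKind::ByRef;
        default:                               return RegKind::Scalar;
    }
}

// Returns the GC kinds of the first and second *integer* return registers
// for a struct returned in two eightbytes.
std::array<RegKind, 2> IntRegKinds(EightByteClass first, EightByteClass second)
{
    std::array<RegKind, 2> kinds = { KindOf(first), KindOf(second) };

    if (first == EightByteClass::SSE)
    {
        // The first eightbyte lives in a float register, so the second
        // eightbyte occupies the first integer register.
        kinds[0] = kinds[1];
        kinds[1] = RegKind::Scalar;
    }
    return kinds;
}
```

With this model, a struct laid out as { double, object } reports the object reference in the first integer register, which is where a GC root scanner would need to look.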

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 1, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 1, 2025
@zengandrew
Contributor Author

@dotnet-policy-service agree

@jkotas jkotas added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 1, 2025
return GetStructReturnKind(VarTypeToReturnKind(retTypeDesc.GetReturnRegType(0)),
VarTypeToReturnKind(retTypeDesc.GetReturnRegType(1)));
{
var_types first = retTypeDesc.GetReturnRegType(0);
Member

This is the right fix for backport. In main, this path is unreachable: the whole method can be put under a TARGET_X86 ifdef and simplified to handle only a single register.

@jakobbotsch How would you like to handle this?

Member

@zengandrew Thanks so much for filing the fix here. I have cherry-picked the commit from this PR directly into the two backports (#116206, #116208).

Given the testing, it appears to be unnecessary for the 9.0 backport. However, 9.0 still has the code around return kinds enabled for x64. That code was removed in #110799, which is not part of 9.0. The actual PRs that made it unnecessary are #95565 and #104336, which are part of 9.0. But, just to be safe, since the code is actually running during hijack, I think we should include it as well.

For main I think we should skip the change (mainly to avoid the misleading commit history, since this code is unused). And indeed we should clean this up (cc @VSadov if you want to do it).

Member

I have pushed a commit with the cleanup

Comment on lines 1303 to 1309
if (eeClass->GetEightByteClassification(0) == SystemVClassificationTypeSSE)
{
// Skip over SSE types since they do not consume integer registers.
// An obj/byref in the 2nd eight bytes will be in the first integer register.
regKinds[0] = regKinds[1];
regKinds[1] = RT_Scalar;
}
Member

Can ParseReturnKindFromSig be simplified too? It is only used by MethodDesc::GetReturnKind, which in turn is only used by GCStress (which is under TARGET_X86) and by CLRToCOMMethodFrame::GcScanRoots_Impl (which does not try to handle the multireg cases and instead hits an NYI assert).

The ReturnKind enum could probably be cleaned up too after that.

Member

> The ReturnKind enum could probably be cleaned up too after that.

The GCInfo decoder source is shared with the diagnostics repo and needs to support older versions of the format, so the ReturnKind values that are no longer used need to stay.

> Can ParseReturnKindFromSig be simplified too?

Done

}
#endif // UNIX_AMD64_ABI

if (pReturnTypeMT->ContainsGCPointers() || pReturnTypeMT->IsByRefLike())
Member

If pReturnTypeMT->IsByRefLike() is true, should we return RT_ByRef?
Or, if we cannot distinguish RT_Object from RT_ByRef, would RT_ByRef be the safer value to return?

Member

Does pReturnTypeMT->IsByRefLike() actually imply that there must be a GC ref?

I think at the C# level you can make any struct stack-only; it does not need to contain pointers or refs. If IsByRefLike() is true for all ref structs, there may not be a pointer.

Perhaps
pReturnTypeMT->ContainsGCPointers() should return RT_Object and
pReturnTypeMT->ContainsGCPointers() && pReturnTypeMT->IsByRefLike() should return RT_ByRef

Or maybe for simplicity just return RT_ByRef for pReturnTypeMT->ContainsGCPointers()?
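
A minimal sketch of the two options floated above, for comparison only. `ReturnTypeFacts`, `ClassifyPrecise`, and `ClassifyConservative` are hypothetical names invented for this example, and the enum is a simplified stand-in rather than the runtime's ReturnKind; this is not the code that was merged.

```cpp
// Simplified stand-in for the return kinds under discussion.
enum class ReturnKind { Scalar, Object, ByRef };

// Hypothetical summary of the two MethodTable queries mentioned above.
struct ReturnTypeFacts
{
    bool containsGCPointers; // models pReturnTypeMT->ContainsGCPointers()
    bool isByRefLike;        // models pReturnTypeMT->IsByRefLike()
};

// Option 1: report ByRef only when the type has GC pointers and is byref-like.
ReturnKind ClassifyPrecise(const ReturnTypeFacts& t)
{
    if (!t.containsGCPointers)
        return ReturnKind::Scalar; // a ref struct without pointers holds nothing to report
    return t.isByRefLike ? ReturnKind::ByRef : ReturnKind::Object;
}

// Option 2: treat every GC pointer as a byref for simplicity.
ReturnKind ClassifyConservative(const ReturnTypeFacts& t)
{
    return t.containsGCPointers ? ReturnKind::ByRef : ReturnKind::Scalar;
}
```

In both variants a byref-like struct with no GC pointers is classified as a scalar, which reflects the concern that IsByRefLike() alone does not imply a GC reference.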

Member

It is not the part that is being fixed, but since we are touching this, and it looks a bit suspicious,...

Member

@jkotas jkotas Jun 9, 2025

Ok, I have done more cleanup. This method is only used for GC stress on x86 now. GC stress can tolerate RT_Illegal return (it did that before this change as well, so it is not a coverage regression).

@jkotas
Member

jkotas commented Jun 8, 2025

/azp run runtime-coreclr gcstress0x3-gcstress0xc

Azure Pipelines successfully started running 1 pipeline(s).


@jkotas jkotas force-pushed the zengandrew-fix-115815 branch from c496662 to dbd7587 Compare June 9, 2025 06:54
@jkotas
Member

jkotas commented Jun 9, 2025

/azp run runtime-coreclr gcstress0x3-gcstress0xc

@jkotas jkotas requested a review from VSadov June 9, 2025 06:58

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Member

jkotas commented Jun 9, 2025

/azp run runtime-coreclr gcstress0x3-gcstress0xc


Azure Pipelines successfully started running 1 pipeline(s).

@jkotas jkotas merged commit bc6f4b0 into dotnet:main Jun 9, 2025
121 of 123 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jul 10, 2025
Successfully merging this pull request may close these issues.

Reproducible segfaults during GC on .NET 8/9 Linux x64
5 participants