-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Fix placement of GT_START_NOGC
for tailcalls in face of bulk copy with write barrier calls
#105551
Conversation
…opy with write barrier calls When the JIT generates code for a tailcall it must generate code to write the arguments into the incoming parameter area. Since the GC ness of the arguments of the tailcall may not match the GC ness of the parameters, we have to disable GC before we start writing these. This is done by finding the earliest `GT_PUTARG_STK` node and placing the start of the NOGC region right before it. In addition, there is logic to take care of potential overlap between the arguments and parameters. For example, if the call has an operand that uses one of the parameters, then we must take care that we do not override that parameter with the tailcall argument before the use of it. To do so, we sometimes may need to introduce copies from the parameter locals to locals on the stack frame. This used to work fine, however, with dotnet#101761 we started transforming block copies into managed calls in certain scenarios. It was possible for the JIT to decide to introduce a copy to a local and for this transformation to then kick in. This would cause us to end up with the managed helper call after starting the nogc region. In checked builds this would hit an assert during GC scan; in release builds, it would end up with corrupted data. The fix here is to make sure we insert the `GT_START_NOGC` after all the potential temporary copies we may introduce as part of the tailcat stll logic. There was an additional assumption that the first `PUTARG_STK` operand was the earliest one in execution order. That is not guaranteed, so this change stops relying on that as well by introducing a new `LIR::FirstNode` and using that to determine the earliest `PUTARG_STK` node. Fix dotnet#102370 Fix dotnet#104123 Fix dotnet#105441
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Any idea why the TailCallOpt config is set up how it is? Seems like we could use an enable range there to allow bisecting in cases like we've seen recently.
/backport to release/9.0-preview7 |
Started backporting to release/9.0-preview7: https://github.com/dotnet/runtime/actions/runs/10115236657 |
No idea.. I also don't know why we have both For bisections related to optimizations I usually find that |
When the JIT generates code for a tailcall it must generate code to write the arguments into the incoming parameter area. Since the GC ness of the arguments of the tailcall may not match the GC ness of the parameters, we have to disable GC before we start writing these. This is done by finding the earliest
GT_PUTARG_STK
node and placing the start of the NOGC region right before it.In addition, there is logic to take care of potential overlap between the arguments and parameters. For example, if the call has an operand that uses one of the parameters, then we must take care that we do not override that parameter with the tailcall argument before the use of it. To do so, we sometimes may need to introduce copies from the parameter locals to locals on the stack frame.
This used to work fine, however, with #101761 we started transforming block copies into managed calls in certain scenarios. It was possible for the JIT to decide to introduce a copy to a local and for this transformation to then kick in. This would cause us to end up with the managed helper call after starting the nogc region. In checked builds this would hit an assert during GC scan; in release builds, it would end up with corrupted data.
The fix here is to make sure we insert the
GT_START_NOGC
after all the potential temporary copies we may introduce as part of the tailcall logic.There was an additional assumption that the first
PUTARG_STK
operand was the earliest one in execution order. That is not guaranteed, so this change stops relying on that as well by introducing a newLIR::FirstNode
and using that to determine the earliestPUTARG_STK
node.Fix #102370
Fix #104123
Fix #105441
I will backport this to preview 7. For preview 6, a workaround of setting
DOTNET_TailCallOpt=0
andDOTNET_ReadyToRun=0
can be utilized.Codegen diff in a test case: