Fix ARM64 floating point registers unwinding in PAL #56919
We were not unwinding the non-volatile floating point registers at all (not transferring them between the CONTEXT and ucontext_t before and after the unw_step). This causes crashes on arm64 Unix in some tests, because the JIT now generates code that uses, e.g., the D8 register, and runtime code that was throwing an exception was using it too.
cc @echesakovMSFT @briansull fix for the issue that we were looking at. |
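To make the description above concrete, here is a minimal sketch of the kind of transfer it refers to: copying the callee-saved FP registers into the unwinder's context before a step and back afterwards. The types `SketchContext` and `SketchUnwindContext` and both function names are hypothetical stand-ins, not the real PAL `CONTEXT`, `ucontext_t`, or libunwind definitions. The sketch relies only on the AAPCS64 rule that the low 64 bits of v8..v15 (that is, d8..d15) are preserved across calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only. They are NOT the real
 * PAL CONTEXT, ucontext_t, or libunwind types. */
enum {
    FIRST_NONVOLATILE_FP = 8,  /* d8 */
    NUM_NONVOLATILE_FP = 8     /* d8..d15 */
};

/* One 128-bit SIMD/FP register, split into two 64-bit halves. */
typedef struct {
    uint64_t low;
    uint64_t high;
} Neon128;

/* Runtime-side context holding all 32 vector registers. */
typedef struct {
    Neon128 v[32];
} SketchContext;

/* Unwinder-side context holding only the callee-saved d8..d15. */
typedef struct {
    uint64_t d[NUM_NONVOLATILE_FP];
} SketchUnwindContext;

/* Before the unwind step: hand d8..d15 to the unwinder. */
static void context_to_unwind(const SketchContext *ctx, SketchUnwindContext *uc)
{
    assert(ctx != NULL && uc != NULL);
    for (int i = 0; i < NUM_NONVOLATILE_FP; i++) {
        uc->d[i] = ctx->v[FIRST_NONVOLATILE_FP + i].low;
    }
}

/* After the unwind step: copy the restored values back. Only the low
 * 64 bits are written; the upper halves are left untouched because
 * AAPCS64 does not require a callee to preserve them. */
static void unwind_to_context(const SketchUnwindContext *uc, SketchContext *ctx)
{
    assert(ctx != NULL && uc != NULL);
    for (int i = 0; i < NUM_NONVOLATILE_FP; i++) {
        ctx->v[FIRST_NONVOLATILE_FP + i].low = uc->d[i];
    }
}
```

In this sketch a caller would invoke `context_to_unwind` before the unwind step and `unwind_to_context` after it; skipping both is the situation the description says caused the crashes.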
Could you please delete this as well: runtime/src/tests/JIT/IL_Conformance/Convert/TestConvertFromIntegral.csproj Lines 10 to 20 in b690695
it was a workaround for this issue. |
I hadn't noticed that libunwind on Linux uses V instead of D in the UNW_AARCH64_* constants. Fixing it now. |
Done |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
How is it that we don't need to unwind the saved floating point registers on other architectures like ARM and X64? |
The test is failing on arm Linux, but only under JitStress, which I was not checking while merging (I ran only outerloop there).
but without stress testing we don't allocate d8. So my guess is that we need the same fix for arm. |
x64 Unix ABI doesn't have any non-volatile floating point registers. But for arm, it seems we will need the same fix. |
Interestingly, Linux arm64 now has many legs failing with this change. I am investigating ... |
And what about ARM64 Windows? |
This is needed for Unix. On Windows, unwinding is done by the OS. |
It is odd that we are just now noticing this, as the JIT will often use the non-volatile floating point registers. |
For the problem to show up, the native code that throws the exception propagated to managed code also has to modify a non-volatile FP register. So I guess the C++ compiler just started doing that in some places. |
Bummer. Libunwind doesn't support unwinding floating point registers on ARM. |
(maybe the future minipal can share mono's unwind implementation so we can get rid of no-gnu libunwind dependency altogether 🙂) |
@am11 IIRC, the mono unwind library is just a very minimal one, supporting only what mono generates. We need to be able to unwind whatever the C++ compiler decides to generate. |
@janvorli, in this case, would it make more sense to use the unwinder from one of the compiler toolchains? If llvm's libunwind is missing context pointers support, and pinning is an option in GC (as used for macOS and FreeBSD, btw, both of which actually use llvm's libunwind), would it cause a significant degradation in quality? In 2013, I think Apple open-sourced their system unwinder to llvm and it became llvm libunwind. I am asking because nognu libunwind does not have many consumers and lacks proper/dedicated maintainership. In Debian, only three (1,2,3 not 3% 🙂) packages depend on nognu libunwind, and even they have the option to switch to other unwinders. Distro maintainers of the nognu libunwind package, as far as I know (Alpine, Gentoo, Debian/Ubuntu), are not really thrilled that it has longstanding unresolved bugs and that it is a complicated implementation that tries to do many things. On the other hand, the compiler toolchains dogfood their unwinder implementations and keep them up to date. |
Unfortunately, the LLVM libunwind has one significant deficiency - it doesn't support |
Actually, I have forgotten to mention that it is related to the cases when we actually unwind using the libunwind, which excludes unwinding from managed code frames. So the problem I have mentioned only occurs on native / managed boundaries. |
In the past, I was already thinking about adding support for unw_get_save_loc to the LLVM libunwind, and after taking a brief look, it seemed it would not be that complicated. However, I think it would add a small penalty to all unwind operations, so I am not sure how difficult it would be to get such a change into the upstream libunwind. As for how to solve the current problem for ARM, I can see several approaches:
|
Hmm, looking at our libunwind, there actually seems to be support for floating point unwinding; it is just that unw_context_t doesn't contain the FP regs. I need to understand it better then. There is a concept of context and cursor in libunwind. It seems that the context is used for initialization, while the cursor is used for walking the stack. So there is a chance that the fix would be easy. |
A few related commits were not included in v1.5:
Can we take their latest master (libunwind/libunwind@c720133) and check whether it solves the problem? There were some fixes for x86 and x64 since the last release as well. |
I have looked at the main branch there before and the context still didn't have the FP registers for ARM. |
Fix looks good now. I've opened an issue to track fixing this on arm32.
Close #56522