Float16 test failures when building with GCC 12 and USE_SYSTEM_CSL=0 USE_BINARYBUILDER_CSL=0 (#44829)
Comments
@nalimilan Could you try to bisect this? I never got anywhere in my investigation.
Well AFAIK it's not a regression in Julia, it's just due to Fedora moving to GCC12. I'll confirm this in a moment [EDIT: I confirm that it also fails on 1.7.2, which builds fine on Fedora 35 with GCC 11].
Why is everybody using gcc 12 before the release? Also Debian...
Well, compiling with gcc12 itself does not seem to be needed here. What I tried now is:
So the problem does not seem to be the gcc12-compiled julia binary, but it seems some functions in
If true, that'd be rather worrying 😕 Do you happen to know whether GCC intends to make this kind of ABI-breaking change? Perhaps it's accidental and it should be reported/fixed.
Right now it's a hypothesis. The issue might be that we shadow a GCC function definition, and I am wondering if that leads to a conflict...
Long comment... I compared the symbols in the gcc11 libgcc_s.so and the gcc12 libgcc_s.so: compared to the gcc11 one, the gcc12 libgcc_s.so adds some new symbols. So I tried
which returns True with gcc11 but False with gcc12, with breakpoints added on all symbols newly added in the gcc12 libgcc_s.so. gdb-gcc11.log.txt Note that with gdb-gcc12.log.txt, and while julia prototype is So my guess is that when julia sees "Float16(2.)", julia (or llvm?) tries to interpret it as "conversion from the double value 2 to some int16 representation" and generates the code to call
A quick workaround is not to call libgcc internal
So (although this may simply be a workaround and the "desired" solution may be different) actually forcibly avoiding
inside both https://koji.fedoraproject.org/koji/taskinfo?taskID=85284148
clang expects this value to be in %ax when you call this function (_Float16 is not specified in the platform ABI, but it would be a major ABI-breaking change for GCC to put it anywhere else)
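To make the register point concrete, here is a toy Python model; it is a sketch, not how Julia, clang, or GCC implement anything. It assumes Python's `struct` format `e` (IEEE 754 binary16) as a stand-in for the real conversion routine. The helper names and the two-entry "register file" are hypothetical, and modelling stale register contents as zero is also an assumption. It shows the 16-bit pattern that travels in an integer register under the convention clang expects, and what a caller decodes if the callee wrote the result somewhere else.

```python
import struct


def half_bits(x: float) -> int:
    """Round x to IEEE 754 binary16 and return the raw 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def half_from_bits(bits: int) -> float:
    """Decode a raw 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def callee_returns_in(register: str, x: float) -> dict:
    """Hypothetical callee: writes the half-precision result to one register.

    "ax" models an integer register, "xmm0" a vector register. Stale
    contents are modelled as zero, which is an assumption of this sketch.
    """
    regs = {"ax": 0, "xmm0": 0}
    regs[register] = half_bits(x)
    return regs


def caller_reads_from(register: str, regs: dict) -> float:
    """Hypothetical caller: decodes whatever bits sit in the register it expects."""
    return half_from_bits(regs[register])


bits = half_bits(2.0)
print(hex(bits))

# Both sides agree on the integer register: the value survives the call.
matched = caller_reads_from("ax", callee_returns_in("ax", 2.0))

# Callee uses the vector register, caller still reads the integer one.
mismatched = caller_reads_from("ax", callee_returns_in("xmm0", 2.0))
print(matched, mismatched)
```

In this model the mismatched caller never sees the converted value, which is the kind of silent disagreement between caller and callee discussed in this thread.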
Thanks @vtjnash for letting me know about this. In fact, I'm working on applying the same ABI to features pre avx512fp16. It's true that it's ABI breaking, which is a big concern to me too. It is expected we will use the same Could you help me evaluate the impact on Julia if we change the ABI? Is there any way to work around it? Is there anything I can help with? Thanks!
The main concern for me was the conditional breakage, since we compile code simultaneously for many different targets. We can usually deal with flag-day type events by runtime detection, but we need it to be detectable in some way then. There seems to be a related ABI problem in llvm / compiler-rt / clang here: https://reviews.llvm.org/D92241, since it is going to create different code based on the target cpu, but llvm CodeGen will have no idea what compiler flags were used to compile the support library, and this may thus result in mis-compilation of code that calls into the runtime support library.
Looking around further at llvm issues, it looks like we are essentially running into the same ABI problems as were being triggered by attempts to land https://reviews.llvm.org/D114099 to enable this in clang also |
I think we won't have conditional breakage if D114099 and related patches are merged. But I have no idea what to do next, given it will expose the problem you described.
Yes, seems like https://reviews.llvm.org/D107082 is a fix for this issue. We can have 2 copies of our
A partial fix to #44829, but perhaps we will still have ABI problems with the sysimg (which may not link directly enough on linux in many cases)
A partial fix to JuliaLang#44829
Fixes #44829, until llvm fixes the support for these intrinsics itself
Fixes #44829, until llvm fixes the support for these intrinsics itself. Also need to handle vectors, since the vectorizer may have introduced them. Also change our runtime emulation versions to f32 for consistency.
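One possible reading of "runtime emulation versions to f32" is that half-precision arithmetic is carried out by widening to single precision, operating there, and rounding back to half. The Python sketch below illustrates that general technique only; it is not taken from the Julia commits. The function names are hypothetical, and `struct` formats `f` and `e` are assumed as stand-ins for the real conversions.

```python
import struct


def round_to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_f16(x: float) -> float:
    """Round to the nearest IEEE 754 binary16 value.

    struct raises OverflowError for magnitudes too large for binary16,
    so this sketch only handles in-range inputs.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def f16_add_via_f32(a: float, b: float) -> float:
    """Hypothetical emulated half-precision addition.

    Inputs are first rounded to binary16, the sum is rounded to binary32
    (the "compute in f32" step), and the result is rounded back to binary16.
    """
    a16, b16 = round_to_f16(a), round_to_f16(b)
    wide = round_to_f32(a16 + b16)
    return round_to_f16(wide)


print(f16_add_via_f32(1.5, 2.25))
print(f16_add_via_f32(1.0, 2.0 ** -11))
```

The second call adds half of the spacing between 1.0 and the next binary16 value, so the exact sum is a tie and rounds back down to 1.0 when narrowed to half precision.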
Should be fixed by #45649
Building Julia with
make USE_SYSTEM_CSL=0 USE_BINARYBUILDER_CSL=0
and GCC 12 fails when running the Float16 tests. This prevents shipping the Julia RPM package in Fedora 36 (https://bugzilla.redhat.com/show_bug.cgi?id=2044284).
Cc: @vchuravy, who had started investigating this.
The output is: