Emit aliases to FP16 conversion routines #45649

vchuravy · 2022-06-11T21:30:18Z

Prefix Float16 intrinsics
Define aliases to FP16 crt in the OJIT

Instead of replacing them late in codegen let LLVM emit these symbols,
but intercept them in the ORC JIT.

I haven't had a chance to test this properly, and it is likely that
we will also need to emit these aliases into the system image, since
code loaded from it will not see the aliases defined here.
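
To illustrate the interception idea, here is a minimal sketch of alias resolution in a symbol table. It is a toy model, not the ORC API: `ToySymbolTable`, `toy_runtime_h2f` and `make_toy_table` are hypothetical names, and only the alias pairs are taken from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using ConvFn = float (*)(std::uint16_t);

// Toy stand-in for a JIT symbol table (not the ORC API).
class ToySymbolTable {
    std::map<std::string, ConvFn> defs;          // real definitions
    std::map<std::string, std::string> aliases;  // alias name -> target name
public:
    void define(const std::string &name, ConvFn fn) { defs[name] = fn; }

    void alias(const std::string &from, const std::string &to) {
        assert(from != to && "an alias must point at a different name");
        aliases[from] = to;
    }

    // Resolve a name, following at most one alias hop; nullptr if unknown.
    ConvFn lookup(const std::string &name) const {
        auto a = aliases.find(name);
        const std::string &target = (a != aliases.end()) ? a->second : name;
        auto d = defs.find(target);
        return d != defs.end() ? d->second : nullptr;
    }
};

// Hypothetical runtime routine; the body is irrelevant, only its address matters here.
inline float toy_runtime_h2f(std::uint16_t) { return 0.0f; }

// Names emitted by the compiler resolve to the prefixed runtime routine.
inline ToySymbolTable make_toy_table() {
    ToySymbolTable t;
    t.define("julia__gnu_h2f_ieee", &toy_runtime_h2f);
    t.alias("__gnu_h2f_ieee", "julia__gnu_h2f_ieee");
    t.alias("__extendhfsf2", "julia__gnu_h2f_ieee");
    return t;
}
```

Code that asks for `__extendhfsf2` or `__gnu_h2f_ieee` ends up at the same address as `julia__gnu_h2f_ieee`, so the compiler can keep emitting the standard names.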

src/jitlayers.cpp

vchuravy · 2022-06-12T14:46:02Z

Argh, for whatever reason one can't create aliases to things outside the compilation unit... that's disappointing.

One alternative is `--defsym` or a linker script, but that is getting rather ugly...

vchuravy · 2022-06-12T23:26:28Z

Okay the error on AArch64 is interesting:

Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:174
  Expression: extendhfsf2(Float16(3.3)) == 3.3007812f0
   Evaluated: -0.00091171265f0 == 3.3007812f0
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:175
  Expression: gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
   Evaluated: -0.0009498596f0 == 3.3007812f0
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:176
  Expression: truncsfhf2(3.3f0) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:177
  Expression: gnu_f2h_ieee(3.3f0) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)
Error in testset intrinsics:
Test Failed at /buildworker/worker/tester_linuxaarch64/build/share/julia/test/intrinsics.jl:178
  Expression: truncdfhf2(3.3) == Float16(3.3)
   Evaluated: Float16(0.225) == Float16(3.3)

I am explicitly testing the Int16 ABI here, and it seems that it doesn't apply on AArch64, which kind of makes sense: GCC has had `_Float16` there forever.
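
For reference, the expected value in these tests follows directly from the binary16 bit pattern: `Float16(3.3)` is stored as `0x429A`, which decodes to exactly 3.30078125. Below is a minimal sketch of the half-to-float direction, assuming IEEE 754 binary16 input passed as a raw `uint16_t` (the Int16-style ABI). `half_bits_to_float` is a hypothetical name for illustration and is not Julia's `half_to_float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decode an IEEE 754 binary16 bit pattern into a float (illustrative sketch).
inline float half_bits_to_float(std::uint16_t h) {
    std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp = (h >> 10) & 0x1Fu;  // 5-bit exponent, bias 15
    std::uint32_t man = h & 0x3FFu;         // 10-bit mantissa
    std::uint32_t bits;

    if (exp == 0) {
        if (man == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal half: shift until the implicit leading bit appears.
            assert(man != 0);
            std::uint32_t shift = 0;
            while ((man & 0x400u) == 0) {
                man <<= 1;
                ++shift;
            }
            man &= 0x3FFu;
            bits = sign | ((113u - shift) << 23) | (man << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (man << 13);  // infinity or NaN
    } else {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);  // reinterpret the bits without aliasing issues
    return f;
}
```

If the caller and callee disagree about whether the half travels in an integer register or a floating-point register, the callee decodes whatever happens to be in the register it reads, which would be consistent with the unrelated values in the log above.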

vchuravy · 2022-06-13T20:17:20Z

This looks good to me. @vtjnash would appreciate a quick review

KristofferC · 2022-06-13T20:31:38Z

This looks good to me

Buildbot Windows disagrees?

vchuravy · 2022-06-13T20:52:14Z

One of them is an OOM, and the other one feels like an OOM as well: https://build.julialang.org/#/builders/72/builds/7343
I retriggered the latter one.

staticfloat · 2022-06-13T22:11:38Z

I'm not sure why we would be OOM'ing, the machines have 32GB of memory available.

Looking at the memory graphs of win64bot1 and win32bot3, I see the following:

I'm not trying hard to attribute one particular dip with another here; just showing that overall, while we are using significant amounts of memory, we aren't in OOM territory yet. I'm willing to bet that the win32 OOM is an address space exhaustion more than an OOM (similar to what we've been seeing elsewhere on linux32) and that win64 is something else entirely.

vchuravy · 2022-06-13T22:33:57Z

sigh Thanks Elliot for checking.

vchuravy · 2022-06-28T22:29:03Z

Can't reproduce the windows failure locally :/

t-bltg · 2022-06-29T14:26:01Z

Fixes #45433, thanks.

t-bltg · 2022-06-29T17:21:56Z

~~Here is the PR backported to 1.6.6, 1.7.3 and 1.8.0-rc1.~~ I've rebuilt all three Julia versions (+ master) and all FP16-related tests pass locally on Ubuntu 22.04 (libgcc 12).

TODO: update these with latest changes, see below.

1.6.6

--- src/APInt-C.cpp  2022-06-29 15:37:58.943951000 +0000
+++ src/APInt-C.cpp  2022-06-29 15:39:56.742904521 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:37:58.943951000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:27:31.074065990 +0000
@@ -51,6 +51,7 @@
 #include <llvm/Support/CodeGen.h>
 #endif
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -276,6 +277,24 @@
     *ci_out = codeinst;
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup, and can
 // also be used be extern consumers like GPUCompiler.jl to obtain a module containing
@@ -554,6 +573,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/julia.expmap  2022-06-29 15:37:58.987952000 +0000
+++ src/julia.expmap  2022-06-29 15:40:28.643715568 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:37:58.991953000 +0000
+++ src/julia_internal.h  2022-06-29 15:42:47.155284019 +0000
@@ -1363,8 +1363,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:37:59.003953000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 15:43:48.056873802 +0000
@@ -169,9 +169,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -215,11 +215,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -243,11 +243,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -268,8 +268,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -309,12 +309,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -832,7 +832,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 #define fpislt_n(c_type, nbits)                                         \
     static inline int fpislt##nbits(c_type a, c_type b) JL_NOTSAFEPOINT \
@@ -903,7 +903,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 15:37:58.975952000 +0000
+++ src/jitlayers.cpp  2022-06-29 15:45:50.344097088 +0000
@@ -737,12 +737,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 15:37:58.975952000 +0000
+++ src/jitlayers.h  2022-06-29 15:46:24.985016703 +0000
@@ -185,6 +185,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 16:28:06.923128000 +0000
+++ src/intrinsics.cpp  2022-06-29 16:30:30.343357962 +0000
@@ -1476,22 +1476,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     return float_to_half((float)param);
 }

--- test/intrinsics.jl  2022-06-29 15:37:59.139956000 +0000
+++ test/intrinsics.jl  2022-06-29 15:49:07.285356548 +0000
@@ -152,3 +152,27 @@
     @test_intrinsic Core.Intrinsics.fptosi Int Float16(3.3) 3
     @test_intrinsic Core.Intrinsics.fptoui UInt Float16(3.3) UInt(3)
 end
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

1.7.3

--- src/APInt-C.cpp  2022-06-29 15:38:07.412161000 +0000
+++ src/APInt-C.cpp  2022-06-29 16:03:07.396275264 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:36:32.101927553 +0000
@@ -50,6 +50,7 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -446,6 +447,24 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -551,6 +570,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/julia.expmap  2022-06-29 15:38:07.444162000 +0000
+++ src/julia.expmap  2022-06-29 16:02:38.471479518 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:38:07.448162000 +0000
+++ src/julia_internal.h  2022-06-29 16:03:58.453680503 +0000
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:38:07.456162000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 16:05:46.116645907 +0000
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.h  2022-06-29 16:08:04.044478978 +0000
@@ -182,6 +182,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 16:26:53.104938000 +0000
+++ src/intrinsics.cpp  2022-06-29 16:31:32.729189496 +0000
@@ -1635,22 +1635,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

1.8.0-rc1

--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-06-29 16:36:32.101927553 +0000
@@ -50,6 +50,7 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
 
@@ -446,6 +447,24 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    // Weak so that this does not get discarded
+    // maybe use llvm.compiler.used instead?
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -551,6 +570,20 @@
                                      "jl_RTLD_DEFAULT_handle_pointer"));
     }
 
+    // We would like to emit an alias or a weakref alias to redirect these symbols
+    // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+    // So for now we inject a definition of these functions that calls our runtime functions.
+    injectCRTAlias(*data->M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+            FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+    injectCRTAlias(*data->M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+    injectCRTAlias(*data->M, "__truncdfhf2", "julia__truncdfhf2",
+            FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
         PM.run(M);

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h 2022-06-29 18:41:05.689863399 +0200
+++ src/jitlayers.h  2022-06-29 18:45:27.071795560 +0200
@@ -204,6 +204,7 @@
     void RegisterJITEventListener(JITEventListener *L);
 #endif
 
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI, but these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
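
For reference, the expected value `3.3007812f0` in the tests above follows directly from the binary16 encoding: `Float16(3.3)` rounds to the bit pattern `0x429A`, which decodes to exactly 3.30078125. Below is a minimal C++ sketch of that decoding. `demo_half_to_float` is a hypothetical helper written for illustration; it is not Julia's `half_to_float` and makes no claim about how the runtime implements it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (illustration only): decode IEEE 754 binary16 bits
// into a binary32 value by rebuilding the sign, exponent and mantissa fields.
static float demo_half_to_float(uint16_t h)
{
    uint32_t sign = static_cast<uint32_t>(h >> 15) << 31;
    uint32_t exp  = (h >> 10) & 0x1F;   // 5-bit exponent, bias 15
    uint32_t man  = h & 0x3FF;          // 10-bit mantissa
    uint32_t bits;
    if (exp == 0) {
        if (man == 0) {
            bits = sign;                // signed zero
        } else {
            // Subnormal half: shift until the implicit bit appears.
            int shift = 0;
            while (!(man & 0x400)) { man <<= 1; ++shift; }
            man &= 0x3FF;
            bits = sign | (static_cast<uint32_t>(127 - 15 - shift + 1) << 23) | (man << 13);
        }
    } else if (exp == 0x1F) {
        bits = sign | 0x7F800000u | (man << 13);   // Inf or NaN
    } else {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exp + (127 - 15)) << 23) | (man << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);   // bit cast without aliasing issues
    return f;
}
```

Calling `demo_half_to_float(0x429A)` yields 3.30078125f. Passing the same 16 bits in an integer register when the callee expects them in a floating-point register (or vice versa) is the kind of ABI mismatch that produces the nonsensical values seen in the AArch64 failure log.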

When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`) that's used to make names unique. This list should be reset when the object writer is reset, because otherwise reuse of the object writer can result in freed symbols being accessed. With some added output, this becomes clear when using `llc` in `--run-twice` mode:

```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj
DefineSymbol::WeakDefaults
  - .weak.foo.default
  - .weak.bar.default
DefineSymbol::WeakDefaults
  - .weak.foo.default
  - áÑJÄ³⌂ p§┼Ø┐☺
  - .debug_macinfo.dw
  - .weak.bar.default
```

This does not seem to leak into the output object file though, so I couldn't come up with a test. I added one that just does `--run-twice` (and verified that it does access freed memory), which should result in detecting the invalid memory accesses when running under ASAN.

Observed in a Julia PR where we started using weak symbols: JuliaLang/julia#45649

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D129840
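
The failure mode described in this commit message can be sketched in plain C++. This is only an analogy under stated assumptions: `DemoWriter` and `DemoSymbol` are hypothetical types, not LLVM classes, and the ownership model is simplified. The point is that a non-owning side list must be cleared in the same `reset()` that frees the objects it points to.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a symbol whose lifetime ends at reset().
struct DemoSymbol { std::string name; };

// Hypothetical stand-in for an object writer that is reused across runs.
class DemoWriter {
    std::vector<std::unique_ptr<DemoSymbol>> symbols_;  // owning storage
    std::vector<const DemoSymbol *> weak_defaults_;     // non-owning side list
public:
    const DemoSymbol *defineWeak(const std::string &name) {
        auto sym = std::make_unique<DemoSymbol>();
        sym->name = ".weak." + name + ".default";
        weak_defaults_.push_back(sym.get());
        symbols_.push_back(std::move(sym));
        return weak_defaults_.back();
    }
    void reset() {
        symbols_.clear();        // frees every DemoSymbol
        weak_defaults_.clear();  // without this line the side list would dangle
    }
    std::size_t weakDefaultCount() const { return weak_defaults_.size(); }
};
```

If `weak_defaults_.clear()` were omitted, a second run after `reset()` would iterate over pointers to freed symbols, which matches the garbage entry visible in the second `WeakDefaults` dump.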

maleadt · 2022-07-19T12:34:38Z

CI failures:

freebsd: fcntl(): Bad file descriptor, also happens on master

linux x64: also happens on other PRs

Pkg                                               (5) |         failed at 2022-07-19T11:41:06.090
Test Failed at /cache/build/default-amdci5-4/julialang/julia-master/julia-ab30809523/share/julia/stdlib/v1.9/Pkg/test/registry.jl:359
Expression: isfile(joinpath(DEPOT_PATH[1], "registries", "General.tar.gz")) != something(unpack, false)
Evaluated: false != false
Test Failed at /cache/build/default-amdci5-4/julialang/julia-master/julia-ab30809523/share/julia/stdlib/v1.9/Pkg/test/registry.jl:378
Expression: isempty(readdir(joinpath(DEPOT_PATH[1], "registries")))
Evaluated: isempty(["General"])

macos x64: unlikely to be related, and didn't occur on retry

InteractiveUtils                                 (13) |         failed at 2022-07-19T06:27:39.462
ProcessExitedException(13)

Crucially, both Windows bots are happy 🎉 So this looks good to go to me.

vchuravy · 2022-07-19T16:23:40Z

Thanks Tim for getting this across the finish line! LGTM!

- Put the interposer in llvm.compiler.used.
- Injecting the aliases after optimization: Our multiversioning pass interacts badly with the llvm.compiler.used gvar.

Co-authored-by: Tim Besard <tim.besard@gmail.com>
Co-authored-by: Valentin Churavy <v.churavy@gmail.com>
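
As a rough source-level analogy for the interposer this PR emits into the system image, the sketch below defines a weak function under the name generated code would call and forwards it to a prefixed runtime function. All names are hypothetical (`demo_runtime_h2f` stands in for a `julia__`-prefixed routine, `demo_crt_h2f` for the CRT symbol), and `__attribute__((weak))` assumes GCC or Clang. The PR itself builds the equivalent function in LLVM IR with `WeakAnyLinkage`; this is not that code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical runtime entry point (illustration only): decode binary16 bits.
extern "C" float demo_runtime_h2f(uint16_t h)
{
    int exp = (h >> 10) & 0x1F;
    int man = h & 0x3FF;
    float mag;
    if (exp == 0)
        mag = std::ldexp(static_cast<float>(man), -24);          // zero / subnormal
    else if (exp == 0x1F)
        mag = man ? NAN : INFINITY;                               // NaN / Inf
    else
        mag = std::ldexp(static_cast<float>(man + 0x400), exp - 25);
    return (h & 0x8000) ? -mag : mag;
}

// Weak interposer: a definition under the "CRT" name that simply forwards.
// Being weak, it yields to any strong definition of the same symbol that
// the final link provides (GCC/Clang attribute; an assumption of this sketch).
extern "C" __attribute__((weak)) float demo_crt_h2f(uint16_t h)
{
    return demo_runtime_h2f(h);
}
```

A caller that references `demo_crt_h2f` then reaches the runtime implementation without the runtime library having to export the CRT name itself.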

vtjnash

LGTM

t-bltg · 2022-07-22T08:28:18Z

@KristofferC, if you need it, here is a manual backport of the merged PR for all concerned versions, with tests passing:

1.6.7

--- src/APInt-C.cpp  2022-06-29 17:37:58.943951000 +0200
+++ src/APInt-C.cpp  2022-06-29 17:39:56.742904521 +0200
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 17:37:58.943951000 +0200
+++ src/aotcompile.cpp  2022-07-22 10:09:59.465318017 +0200
@@ -51,8 +51,10 @@
 #include <llvm/Support/CodeGen.h>
 #endif
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -276,6 +278,23 @@
     *ci_out = codeinst;
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup, and can
 // also be used be extern consumers like GPUCompiler.jl to obtain a module containing
@@ -556,7 +575,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 17:37:58.987952000 +0200
+++ src/julia.expmap  2022-06-29 17:40:28.643715568 +0200
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 17:37:58.991953000 +0200
+++ src/julia_internal.h  2022-06-29 17:42:47.155284019 +0200
@@ -1363,8 +1363,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 17:37:59.003953000 +0200
+++ src/runtime_intrinsics.c  2022-07-19 18:37:28.928908192 +0200
@@ -169,9 +169,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -215,11 +215,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -243,11 +243,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -268,8 +268,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -309,12 +309,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -832,7 +832,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 #define fpislt_n(c_type, nbits)                                         \
     static inline int fpislt##nbits(c_type a, c_type b) JL_NOTSAFEPOINT \
@@ -903,7 +903,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 17:37:58.975952000 +0200
+++ src/jitlayers.cpp  2022-06-29 17:45:50.344097088 +0200
@@ -737,12 +737,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 17:37:58.975952000 +0200
+++ src/jitlayers.h  2022-06-29 17:46:24.985016703 +0200
@@ -185,6 +185,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 18:28:06.923128000 +0200
+++ src/intrinsics.cpp  2022-06-29 18:30:30.343357962 +0200
@@ -1476,22 +1476,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     return float_to_half((float)param);
 }

--- test/intrinsics.jl  2022-06-29 17:37:59.139956000 +0200
+++ test/intrinsics.jl  2022-06-29 17:49:07.285356548 +0200
@@ -152,3 +152,27 @@
     @test_intrinsic Core.Intrinsics.fptosi Int Float16(3.3) 3
     @test_intrinsic Core.Intrinsics.fptoui UInt Float16(3.3) UInt(3)
 end
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI, but these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

1.7.3

--- src/APInt-C.cpp  2022-06-29 17:38:07.412161000 +0200
+++ src/APInt-C.cpp  2022-06-29 18:03:07.396275264 +0200
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 17:38:07.416161000 +0200
+++ src/aotcompile.cpp  2022-07-22 10:08:25.371800696 +0200
@@ -50,8 +50,10 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -446,6 +448,23 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -553,7 +572,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 17:38:07.444162000 +0200
+++ src/julia.expmap  2022-06-29 18:02:38.471479518 +0200
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 17:38:07.448162000 +0200
+++ src/julia_internal.h  2022-06-29 18:03:58.453680503 +0200
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 17:38:07.456162000 +0200
+++ src/runtime_intrinsics.c  2022-06-29 18:05:46.116645907 +0200
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \

--- src/jitlayers.cpp  2022-06-29 17:38:07.440162000 +0200
+++ src/jitlayers.cpp  2022-06-29 18:38:19.841056942 +0200
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)

--- src/jitlayers.h  2022-06-29 17:38:07.440162000 +0200
+++ src/jitlayers.h  2022-06-29 18:08:04.044478978 +0200
@@ -182,6 +182,7 @@
                          const object::ObjectFile &Obj,
                          const RuntimeDyld::LoadedObjectInfo &LoadedObjectInfo);
 #endif
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);
 #if JL_LLVM_VERSION < 120000

--- src/intrinsics.cpp  2022-06-29 18:26:53.104938000 +0200
+++ src/intrinsics.cpp  2022-06-29 18:31:32.729189496 +0200
@@ -1635,22 +1635,17 @@
 
 #if !defined(_OS_DARWIN_)   // xcode already links compiler-rt
 
-extern "C" JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+extern "C" JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-extern "C" JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-extern "C" JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+extern "C" JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-extern "C" JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+extern "C" JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;

--- test/intrinsics.jl  2022-06-29 17:38:07.584165000 +0200
+++ test/intrinsics.jl  2022-06-29 18:56:50.640396691 +0200
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI, but these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end

1.8.0-rc3

--- src/APInt-C.cpp  2022-06-29 15:38:07.412161000 +0000
+++ src/APInt-C.cpp  2022-06-29 16:03:07.396275264 +0000
@@ -316,7 +316,7 @@
 void LLVMFPtoInt(unsigned numbits, void *pa, unsigned onumbits, integerPart *pr, bool isSigned, bool *isExact) {
     double Val;
     if (numbits == 16)
-        Val = __gnu_h2f_ieee(*(uint16_t*)pa);
+        Val = julia__gnu_h2f_ieee(*(uint16_t*)pa);
     else if (numbits == 32)
         Val = *(float*)pa;
     else if (numbits == 64)
@@ -391,7 +391,7 @@
         val = a.roundToDouble(true);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)
@@ -408,7 +408,7 @@
         val = a.roundToDouble(false);
     }
     if (onumbits == 16)
-        *(uint16_t*)pr = __gnu_f2h_ieee(val);
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(val);
     else if (onumbits == 32)
         *(float*)pr = val;
     else if (onumbits == 64)

--- src/aotcompile.cpp  2022-06-29 15:38:07.416161000 +0000
+++ src/aotcompile.cpp  2022-07-19 16:43:52.586543207 +0000
@@ -50,8 +50,10 @@
 #include <llvm/MC/MCCodeEmitter.h>
 #include <llvm/Support/CodeGen.h>
 
+#include <llvm/IR/IRBuilder.h>
 #include <llvm/IR/LegacyPassManagers.h>
 #include <llvm/Transforms/Utils/Cloning.h>
+#include <llvm/Transforms/Utils/ModuleUtils.h>
 
 
 using namespace llvm;
@@ -446,6 +448,23 @@
     jl_safe_printf("ERROR: failed to emit output file %s\n", err.c_str());
 }
 
+static void injectCRTAlias(Module &M, StringRef name, StringRef alias, FunctionType *FT)
+{
+    Function *target = M.getFunction(alias);
+    if (!target) {
+        target = Function::Create(FT, Function::ExternalLinkage, alias, M);
+    }
+    Function *interposer = Function::Create(FT, Function::WeakAnyLinkage, name, M);
+    appendToCompilerUsed(M, {interposer});
+
+    llvm::IRBuilder<> builder(BasicBlock::Create(M.getContext(), "top", interposer));
+    SmallVector<Value *, 4> CallArgs;
+    for (auto &arg : interposer->args())
+        CallArgs.push_back(&arg);
+    auto val = builder.CreateCall(target, CallArgs);
+    builder.CreateRet(val);
+}
+
 
 // takes the running content that has collected in the shadow module and dump it to disk
 // this builds the object file portion of the sysimage files for fast startup
@@ -553,7 +572,22 @@
 
     // do the actual work
     auto add_output = [&] (Module &M, StringRef unopt_bc_Name, StringRef bc_Name, StringRef obj_Name, StringRef asm_Name) {
+        // We would like to emit an alias or a weakref alias to redirect these symbols
+        // but LLVM doesn't let us emit a GlobalAlias to a declaration...
+        // So for now we inject a definition of these functions that calls our runtime functions.
+        injectCRTAlias(M, "__gnu_h2f_ieee", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__extendhfsf2", "julia__gnu_h2f_ieee",
+                FunctionType::get(Type::getFloatTy(Context), { Type::getHalfTy(Context) }, false));
+        injectCRTAlias(M, "__gnu_f2h_ieee", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncsfhf2", "julia__gnu_f2h_ieee",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getFloatTy(Context) }, false));
+        injectCRTAlias(M, "__truncdfhf2", "julia__truncdfhf2",
+                FunctionType::get(Type::getHalfTy(Context), { Type::getDoubleTy(Context) }, false));
+
         PM.run(M);
+
         if (unopt_bc_fname)
             emit_result(unopt_bc_Archive, unopt_bc_Buffer, unopt_bc_Name, outputs);
         if (bc_fname)

--- src/julia.expmap  2022-06-29 15:38:07.444162000 +0000
+++ src/julia.expmap  2022-06-29 16:02:38.471479518 +0000
@@ -42,12 +42,6 @@
     environ;
     __progname;
 
-    /* compiler run-time intrinsics */
-    __gnu_h2f_ieee;
-    __extendhfsf2;
-    __gnu_f2h_ieee;
-    __truncdfhf2;
-
   local:
     *;
 };

--- src/julia_internal.h  2022-06-29 15:38:07.448162000 +0000
+++ src/julia_internal.h  2022-06-29 16:03:58.453680503 +0000
@@ -1427,8 +1427,9 @@
   #define JL_GC_ASSERT_LIVE(x) (void)(x)
 #endif
 
-float __gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
-uint16_t __gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param) JL_NOTSAFEPOINT;
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param) JL_NOTSAFEPOINT;
 
 #ifdef __cplusplus
 }

--- src/runtime_intrinsics.c  2022-06-29 15:38:07.456162000 +0000
+++ src/runtime_intrinsics.c  2022-06-29 16:05:46.116645907 +0000
@@ -188,22 +188,17 @@
     return h;
 }
 
-JL_DLLEXPORT float __gnu_h2f_ieee(uint16_t param)
+JL_DLLEXPORT float julia__gnu_h2f_ieee(uint16_t param)
 {
     return half_to_float(param);
 }
 
-JL_DLLEXPORT float __extendhfsf2(uint16_t param)
-{
-    return half_to_float(param);
-}
-
-JL_DLLEXPORT uint16_t __gnu_f2h_ieee(float param)
+JL_DLLEXPORT uint16_t julia__gnu_f2h_ieee(float param)
 {
     return float_to_half(param);
 }
 
-JL_DLLEXPORT uint16_t __truncdfhf2(double param)
+JL_DLLEXPORT uint16_t julia__truncdfhf2(double param)
 {
     float res = (float)param;
     uint32_t resi;
@@ -338,9 +338,9 @@
     }
 
 #define fp_select(a, func) \
-    sizeof(a) == sizeof(float) ? func##f((float)a) : func(a)
+    sizeof(a) <= sizeof(float) ? func##f((float)a) : func(a)
 #define fp_select2(a, b, func) \
-    sizeof(a) == sizeof(float) ? func##f(a, b) : func(a, b)
+    sizeof(a) <= sizeof(float) ? func##f(a, b) : func(a, b)
 
 // fast-function generators //
 
@@ -384,11 +384,11 @@
 static inline void name(unsigned osize, void *pa, void *pr) JL_NOTSAFEPOINT \
 { \
     uint16_t a = *(uint16_t*)pa; \
-    float A = __gnu_h2f_ieee(a); \
+    float A = julia__gnu_h2f_ieee(a); \
     if (osize == 16) { \
         float R; \
         OP(&R, A); \
-        *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+        *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
     } else { \
         OP((uint16_t*)pr, A); \
     } \
@@ -412,11 +412,11 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     float R = OP(A, B); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 // float or integer inputs, bool output
@@ -437,8 +437,8 @@
 { \
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
     runtime_nbits = 16; \
     return OP(A, B); \
 }
@@ -478,12 +478,12 @@
     uint16_t a = *(uint16_t*)pa; \
     uint16_t b = *(uint16_t*)pb; \
     uint16_t c = *(uint16_t*)pc; \
-    float A = __gnu_h2f_ieee(a); \
-    float B = __gnu_h2f_ieee(b); \
-    float C = __gnu_h2f_ieee(c); \
+    float A = julia__gnu_h2f_ieee(a); \
+    float B = julia__gnu_h2f_ieee(b); \
+    float C = julia__gnu_h2f_ieee(c); \
     runtime_nbits = 16; \
     float R = OP(A, B, C); \
-    *(uint16_t*)pr = __gnu_f2h_ieee(R); \
+    *(uint16_t*)pr = julia__gnu_f2h_ieee(R); \
 }
 
 
@@ -1001,7 +1001,7 @@
 fpiseq_n(float, 32)
 fpiseq_n(double, 64)
 #define fpiseq(a,b) \
-    sizeof(a) == sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
+    sizeof(a) <= sizeof(float) ? fpiseq32(a, b) : fpiseq64(a, b)
 
 bool_fintrinsic(eq,eq_float)
 bool_fintrinsic(ne,ne_float)
@@ -1050,7 +1050,7 @@
         if (!(osize < 8 * sizeof(a))) \
             jl_error("fptrunc: output bitsize must be < input bitsize"); \
         else if (osize == 16) \
-            *(uint16_t*)pr = __gnu_f2h_ieee(a); \
+            *(uint16_t*)pr = julia__gnu_f2h_ieee(a); \
         else if (osize == 32) \
             *(float*)pr = a; \
         else if (osize == 64) \
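
The `fp_select`, `fp_select2`, and `fpiseq` hunks above relax the size test from `==` to `<=`, so an operand narrower than `float` takes the single-precision path instead of falling through to the double-precision one. A minimal sketch of that selection rule follows. The names `demo_op`, `demo_opf`, `fp_select_eq`, and `fp_select_le` are hypothetical stand-ins that do not appear in the patch, and a `uint16_t` is assumed here as the 2-byte operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a libm pair such as sqrtf/sqrt.
 * Each one counts how often it was called. */
static int float_calls, double_calls;
static float demo_opf(float x) { float_calls++; return x; }
static double demo_op(double x) { double_calls++; return x; }

/* Old rule: only operands exactly as wide as float use the float variant. */
#define fp_select_eq(a, func) \
    (sizeof(a) == sizeof(float) ? func##f((float)(a)) : func(a))

/* New rule: operands as wide as float, or narrower, use the float variant. */
#define fp_select_le(a, func) \
    (sizeof(a) <= sizeof(float) ? func##f((float)(a)) : func(a))

/* Returns 1 if a 2-byte operand reaches the float variant under the old rule. */
static int old_rule_uses_float(void)
{
    uint16_t a = 3;
    float_calls = double_calls = 0;
    (void)fp_select_eq(a, demo_op);
    return float_calls == 1 && double_calls == 0;
}

/* Returns 1 if a 2-byte operand reaches the float variant under the new rule. */
static int new_rule_uses_float(void)
{
    uint16_t a = 3;
    float_calls = double_calls = 0;
    (void)fp_select_le(a, demo_op);
    return float_calls == 1 && double_calls == 0;
}

/* With `==` the 2-byte operand goes to the double variant; with `<=` it does not. */
static void check_selection(void)
{
    assert(!old_rule_uses_float());
    assert(new_rule_uses_float());
}
```

Calling `check_selection()` from a `main` exercises both rules. For a 4-byte operand the two macros agree, so the change only affects narrower types.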

--- src/jitlayers.cpp  2022-06-29 15:38:07.440162000 +0000
+++ src/jitlayers.cpp  2022-06-29 16:38:19.841056942 +0000
@@ -728,12 +728,26 @@
     }
 
     JD.addToLinkOrder(GlobalJD, orc::JITDylibLookupFlags::MatchExportedSymbolsOnly);
+
+    orc::SymbolAliasMap jl_crt = {
+        { mangle("__gnu_h2f_ieee"), { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__extendhfsf2"),  { mangle("julia__gnu_h2f_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__gnu_f2h_ieee"), { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncsfhf2"),   { mangle("julia__gnu_f2h_ieee"), JITSymbolFlags::Exported } },
+        { mangle("__truncdfhf2"),   { mangle("julia__truncdfhf2"),   JITSymbolFlags::Exported } }
+    };
+    cantFail(GlobalJD.define(orc::symbolAliases(jl_crt)));
 }
 
-void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+orc::SymbolStringPtr JuliaOJIT::mangle(StringRef Name)
 {
     std::string MangleName = getMangledName(Name);
-    cantFail(JD.define(orc::absoluteSymbols({{ES.intern(MangleName), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
+    return ES.intern(MangleName);
+}
+
+void JuliaOJIT::addGlobalMapping(StringRef Name, uint64_t Addr)
+{
+    cantFail(JD.define(orc::absoluteSymbols({{mangle(Name), JITEvaluatedSymbol::fromPointer((void*)Addr)}})));
 }
 
 void JuliaOJIT::addModule(std::unique_ptr<Module> M)
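
The `orc::SymbolAliasMap` above redirects five compiler-runtime names to three `julia`-prefixed runtime functions. As a rough illustration of what that table expresses, the sketch below models it as a plain name-to-name lookup. It ignores platform name mangling and the ORC machinery entirely; `crt_aliases`, `resolve_alias`, and `check_aliases` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry per alias: the name LLVM emits and the runtime symbol it maps to. */
struct alias {
    const char *name;
    const char *target;
};

/* Same pairs as the jl_crt map in the diff above. */
static const struct alias crt_aliases[] = {
    { "__gnu_h2f_ieee", "julia__gnu_h2f_ieee" },
    { "__extendhfsf2",  "julia__gnu_h2f_ieee" },
    { "__gnu_f2h_ieee", "julia__gnu_f2h_ieee" },
    { "__truncsfhf2",   "julia__gnu_f2h_ieee" },
    { "__truncdfhf2",   "julia__truncdfhf2" },
};

/* Returns the name a lookup is redirected to, or the name itself if it has no alias. */
static const char *resolve_alias(const char *name)
{
    size_t n = sizeof crt_aliases / sizeof crt_aliases[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(crt_aliases[i].name, name) == 0)
            return crt_aliases[i].target;
    }
    return name;
}

/* Both half-to-float names end up at the same runtime function. */
static void check_aliases(void)
{
    assert(strcmp(resolve_alias("__extendhfsf2"), "julia__gnu_h2f_ieee") == 0);
    assert(strcmp(resolve_alias("__gnu_h2f_ieee"), resolve_alias("__extendhfsf2")) == 0);
    assert(strcmp(resolve_alias("memcpy"), "memcpy") == 0);
}
```

Note that `__extendhfsf2` and `__truncsfhf2` have no dedicated runtime function of their own; they share targets with the `__gnu_*` names.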

--- src/jitlayers.h 2022-06-29 18:41:05.689863399 +0200
+++ src/jitlayers.h  2022-06-29 18:45:27.071795560 +0200
@@ -204,6 +204,7 @@
     void RegisterJITEventListener(JITEventListener *L);
 #endif
 
+    orc::SymbolStringPtr mangle(StringRef Name);
     void addGlobalMapping(StringRef Name, uint64_t Addr);
     void addModule(std::unique_ptr<Module> M);

--- test/intrinsics.jl  2022-06-29 15:38:07.584165000 +0000
+++ test/intrinsics.jl  2022-06-29 16:56:50.640396691 +0000
@@ -284,3 +284,27 @@
         @test r2 isa IntWrap && r2.x === 103 === r[].x && r2 !== r[]
     end
 end)()
+
+if Sys.ARCH == :aarch64
+    # On AArch64 we are following the `_Float16` ABI. But these functions expect `Int16`.
+    # TODO: Should we have `Chalf == Int16` and `Cfloat16 == Float16`?
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Int16,), reinterpret(Int16, x))
+    truncsfhf2(x::Float32) = reinterpret(Float16, ccall("extern __truncsfhf2", llvmcall, Int16, (Float32,), x))
+    gnu_f2h_ieee(x::Float32) = reinterpret(Float16, ccall("extern __gnu_f2h_ieee", llvmcall, Int16, (Float32,), x))
+    truncdfhf2(x::Float64) = reinterpret(Float16, ccall("extern __truncdfhf2", llvmcall, Int16, (Float64,), x))
+else
+    extendhfsf2(x::Float16) = ccall("extern __extendhfsf2", llvmcall, Float32, (Float16,), x)
+    gnu_h2f_ieee(x::Float16) = ccall("extern __gnu_h2f_ieee", llvmcall, Float32, (Float16,), x)
+    truncsfhf2(x::Float32) = ccall("extern __truncsfhf2", llvmcall, Float16, (Float32,), x)
+    gnu_f2h_ieee(x::Float32) = ccall("extern __gnu_f2h_ieee", llvmcall, Float16, (Float32,), x)
+    truncdfhf2(x::Float64) = ccall("extern __truncdfhf2", llvmcall, Float16, (Float64,), x)
+end
+
+@testset "Float16 intrinsics (crt)" begin
+    @test extendhfsf2(Float16(3.3)) == 3.3007812f0
+    @test gnu_h2f_ieee(Float16(3.3)) == 3.3007812f0
+    @test truncsfhf2(3.3f0) == Float16(3.3)
+    @test gnu_f2h_ieee(3.3f0) == Float16(3.3)
+    @test truncdfhf2(3.3) == Float16(3.3)
+end
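
The test set above expects `Float16(3.3)` to widen to `3.3007812f0`. That follows from the binary16 encoding: 3.3 rounds to the bit pattern `0x429A`, which is exactly 3.30078125. The sketch below decodes a binary16 bit pattern passed as `uint16_t`, the same integer-typed signature that `julia__gnu_h2f_ieee` declares. `half_bits_to_float` is a hypothetical helper written for illustration and is not Julia's `half_to_float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an IEEE 754 binary16 bit pattern into a float. */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {
        /* Inf or NaN: all-ones exponent, payload shifted into place. */
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal: shift until the implicit leading bit appears. */
        int e = -1;
        do {
            mant <<= 1;
            e++;
        } while ((mant & 0x400u) == 0);
        bits = sign | ((uint32_t)(112 - e) << 23) | ((mant & 0x3FFu) << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f); /* reinterpret the bits without aliasing issues */
    return f;
}

/* Mirrors the first @test above: Float16(3.3) has the bit pattern 0x429A. */
static void check_test_value(void)
{
    assert(half_bits_to_float(0x429Au) == 3.30078125f);
}
```

Passing the bits as an integer is what the non-AArch64 branch of the test avoids by declaring the argument as `Float16`; the AArch64 branch reinterprets to `Int16` first.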

~~Or should I open a PR for each backport?~~

EDIT: I missed #46110, sorry.

maleadt · 2022-07-22T09:33:20Z

There's already a backport PR for 1.8: #46110. I didn't think we were going to backport this to 1.7 or 1.6 though? It would also need a backport of https://reviews.llvm.org/D129840 to all relevant LLVM branches, which hasn't happened yet.

Backport "Emit aliases for FP16 conversion routines" (#45649) to 1.8

When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`) that's used to make names unique. This list should be reset when the object writer is reset, because otherwise reuse of the object writer can result in freed symbols being accessed. With some added output, this becomes clear when using `llc` in `--run-twice` mode:

```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj
DefineSymbol::WeakDefaults
  - .weak.foo.default
  - .weak.bar.default
DefineSymbol::WeakDefaults
  - .weak.foo.default
  - áÑJÄ³⌂ p§┼Ø┐☺
  - .debug_macinfo.dw
  - .weak.bar.default
```

This does not seem to leak into the output object file though, so I couldn't come up with a test. I added one that just does `--run-twice` (and verified that it does access freed memory), which should result in detecting the invalid memory accesses when running under ASAN.

Observed in a Julia PR where we started using weak symbols: JuliaLang/julia#45649

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D129840
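
The bug class described in this commit message, per-object state that survives a reset, can be shown with a toy model. This is only a sketch under stated assumptions: `demo_writer` and its functions are hypothetical and do not reflect the actual `WinCOFFObjectWriter` code. The list stores borrowed name pointers, so entries left over from a previous object would dangle once that object's symbols are freed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_WEAK 8

/* Toy object writer holding state that belongs to the object being written. */
struct demo_writer {
    const char *weak_defaults[DEMO_MAX_WEAK]; /* borrowed names of weak defaults */
    size_t n_weak;
    size_t n_sections;
};

/* Record a weak default name for the current object. */
static void demo_add_weak(struct demo_writer *w, const char *name)
{
    if (w->n_weak < DEMO_MAX_WEAK)
        w->weak_defaults[w->n_weak++] = name;
}

/* Buggy reset: clears the section count but leaves the weak list populated. */
static void demo_reset_buggy(struct demo_writer *w)
{
    w->n_sections = 0;
}

/* Fixed reset: clears every piece of per-object state. */
static void demo_reset_fixed(struct demo_writer *w)
{
    w->n_sections = 0;
    w->n_weak = 0;
}

/* After the buggy reset the old entries are still visible to the next run. */
static void check_reset(void)
{
    struct demo_writer w = { {0}, 0, 0 };
    demo_add_weak(&w, ".weak.foo.default");
    demo_add_weak(&w, ".weak.bar.default");
    demo_reset_buggy(&w);
    assert(w.n_weak == 2); /* stale entries remain */
    demo_reset_fixed(&w);
    assert(w.n_weak == 0); /* list is empty for the next object */
}
```

In this toy the stale entries point at string literals, so nothing is actually freed; in the real writer the referenced symbols belong to the previous compilation, which is why running the same input twice exposes the problem.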


staticfloat · 2022-12-24T22:06:25Z

This was marked for backporting onto 1.6, but we ran into errors because, as Tim said, the LLVM backport has not happened. If someone wants this backported to 1.6, they will need to get a proper release-1.6 LLVM bump going with the backported fixes (as Tim pointed out).

vchuravy added this to the 1.8 milestone Jun 11, 2022

vchuravy added the `backport 1.8` (Change should be backported to release-1.8) and `float16` labels Jun 11, 2022

vchuravy requested review from maleadt, pchintalapudi and vtjnash June 11, 2022 21:30

vchuravy mentioned this pull request Jun 11, 2022

release-1.8: Revert "codegen: explicitly handle Float16 intrinsics (#45249)" #45627

Merged

vchuravy commented on src/jitlayers.cpp (outdated) Jun 12, 2022

Base automatically changed from revert-45249-jn/44829c to master June 12, 2022 20:45

vchuravy marked this pull request as ready for review June 12, 2022 20:46

vchuravy force-pushed the vc/fp16 branch 2 times, most recently from aed6adf to ab1aa83 Compare June 12, 2022 21:46

vchuravy force-pushed the vc/fp16 branch from 8652f17 to 77dce39 Compare June 13, 2022 13:16

vchuravy mentioned this pull request Jun 26, 2022

Float16 test failures when building with GCC 12 and USE_SYSTEM_CSL=0 USE_BINARYBUILDER_CSL=0 #44829

Closed

vchuravy force-pushed the vc/fp16 branch from 77dce39 to fe411e4 Compare June 26, 2022 16:06

t-bltg mentioned this pull request Jun 29, 2022

Build: failing Float16 intrinsics on 1.6.6 or 1.7.2 on ubuntu 22.04 #45433

Closed

giordano mentioned this pull request Jun 29, 2022

[CompilerSupportLibraries_jll] Upgrade to libraries from GCC 12 #45582

Merged

vchuravy added the `backport 1.6` (Change should be backported to release-1.6) and `backport 1.7` labels Jun 30, 2022

vchuravy force-pushed the vc/fp16 branch from 7c3bd1f to a22d538 Compare July 3, 2022 17:30

maleadt force-pushed the vc/fp16 branch from c73f81b to 72113cb Compare July 16, 2022 18:34

maleadt changed the base branch from master to tb/llvm July 16, 2022 18:36

KristofferC mentioned this pull request Jul 17, 2022

release-1.8: Backports for 1.8-rc4 #46075

Merged

32 tasks

Base automatically changed from tb/llvm to master July 17, 2022 18:54

maleadt mentioned this pull request Jul 19, 2022

Update LLVM to include additional patches. #46091

Merged

maleadt force-pushed the vc/fp16 branch from 72113cb to ab30809 Compare July 19, 2022 09:39

maleadt approved these changes Jul 19, 2022

vchuravy removed the `DO NOT MERGE` (Do not merge this PR!) label Jul 19, 2022

vtjnash and others added 3 commits July 19, 2022 12:43

Prefix Float16 intrinsics

f651866

Define aliases to FP16 crt in the OJIT

ff36015

vchuravy force-pushed the vc/fp16 branch from ab30809 to 3407fb3 Compare July 19, 2022 16:45

vchuravy merged commit adf2e1b into master Jul 20, 2022

vchuravy deleted the vc/fp16 branch July 20, 2022 00:16

KristofferC mentioned this pull request Jul 20, 2022

release-1.6: Backports for Julia 1.6.8 #46116

Closed

78 tasks

vtjnash reviewed Jul 21, 2022

vchuravy added a commit that referenced this pull request Aug 2, 2022

Merge pull request #46110 from JuliaLang/vc/fp16_bb

4983135

Backport "Emit aliases for FP16 conversion routines" (#45649) to 1.8

KristofferC removed the `backport 1.8` (Change should be backported to release-1.8) label Aug 7, 2022

giordano mentioned this pull request Dec 21, 2022

[CompilerSupportLibraries_jll] Update to new build #47444

Merged

vchuravy mentioned this pull request Dec 26, 2022

Implement support for object caching through pkgimages #47184

Merged

5 tasks

giordano mentioned this pull request Mar 8, 2023

julia: patch for gcc12 support spack/spack#35931

Open
