[PPC/Pwr8] Half-precision conversion from BigInt #39330

vchuravy · 2021-01-20T03:10:40Z

Conversion from BigInt

gmp                                (78) |         failed at 2021-01-19T21:39:56.822
Test Failed at /nobackup/users/vchuravy/dev/julia/test/gmp.jl:501
  Expression: T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T))) === T(Inf)
   Evaluated: -Inf16 === Inf16

On Power9 `julia -C pwr9`

julia> T=Float16; n = exponent(floatmax(T))
15

julia> T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T)))
Inf16

On Power9 `julia -C pwr8`

julia> T=Float16; n = exponent(floatmax(T))
15

julia> T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T)))
-Inf16
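For context on why `Inf16` is the expected answer, the tested value can be worked out by hand: with `n = 15` and `precision(Float16) = 11`, it is 2^16 - 2^4 = 65520, exactly halfway between `floatmax(Float16)` (65504) and 2^16. Under round-to-nearest, ties-to-even, that tie should round up and overflow to `+Inf`. The sketch below only checks that arithmetic; the function names are made up for illustration and are not part of Julia or the test suite.

```c
#include <assert.h>
#include <stdint.h>

/* 2^(n + 1) - 2^(n - p): the value the failing test converts. */
uint64_t tested_input(unsigned n, unsigned p)
{
    assert(p <= n && n + 1 < 64);
    return (UINT64_C(1) << (n + 1)) - (UINT64_C(1) << (n - p));
}

/* Largest finite value of a binary format with p significand bits and
 * maximum exponent n: (2^p - 1) * 2^(n - (p - 1)). */
uint64_t largest_finite(unsigned n, unsigned p)
{
    assert(p >= 1 && p - 1 <= n && n + 1 < 64);
    return ((UINT64_C(1) << p) - 1) << (n - (p - 1));
}

/* 1 if x lies exactly halfway between largest_finite(n, p) and 2^(n + 1). */
int is_overflow_tie(uint64_t x, unsigned n, unsigned p)
{
    uint64_t lo = largest_finite(n, p);
    uint64_t hi = UINT64_C(1) << (n + 1);
    return x > lo && x < hi && (x - lo) == (hi - x);
}
```

For Float16, `tested_input(15, 11)` is 65520 and `largest_finite(15, 11)` is 65504, so the input sits 16 above the largest finite value and 16 below 2^16. Only the sign of the result is in question in this issue.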

Gist: https://gist.github.com/vchuravy/a0170b4d42e46d800aae53b3a80bb582

Duplicate of #38896, which I closed since fixing #38883 made it harder to reproduce until #39300.

Upstream: https://bugs.llvm.org/show_bug.cgi?id=49092


vchuravy · 2021-01-20T04:56:06Z

Reduce to:

source_filename = "g"
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

declare i64 @llvm.ctlz.i64(i64, i1 immarg) #1

define half @julia_g_219(i64 zeroext %0, i64 signext %1, i32 signext %2) {
top:
  %3 = call i64 @llvm.ctlz.i64(i64 %0, i1 false)
  %4 = icmp ugt i64 %0, 65535
  %5 = icmp slt i32 %2, 0
  %6 = sub i32 0, %2
  %7 = select i1 %5, i32 %6, i32 %2
  %8 = icmp sgt i32 %7, 1
  %value_phi.in = or i1 %4, %8
  br i1 %value_phi.in, label %L48, label %L14

L14:                                              ; preds = %top
  %9 = sub nsw i64 52, %3
  %10 = icmp ult i64 %0, 2048
  %11 = lshr i64 %0, %9
  %12 = icmp ugt i64 %9, 63
  %13 = select i1 %12, i64 0, i64 %11
  %14 = sub nsw i64 0, %9
  %15 = shl i64 %0, %14
  %16 = icmp ugt i64 %14, 63
  %17 = select i1 %16, i64 0, i64 %15
  %18 = select i1 %10, i64 %17, i64 %13
  %19 = trunc i64 %18 to i16
  %20 = add i16 %19, 1
  %21 = lshr i16 %20, 1
  %22 = icmp eq i64 %9, %1
  %23 = zext i1 %22 to i16
  %24 = xor i16 %23, -1
  %25 = and i16 %21, %24
  %26 = trunc i64 %3 to i16
  %27 = shl i16 %26, 10
  %28 = sub i16 13312, %27
  %29 = add i16 %28, %25
  %30 = bitcast i16 %29 to half
  br label %L48

L48:                                              ; preds = %L14, %top
  %value_phi1 = phi half [ %30, %L14 ], [ 0xH7C00, %top ]
  %31 = icmp sgt i32 %2, -1
  %32 = fpext half %value_phi1 to float
  %33 = fneg float %32
  %34 = fptrunc float %33 to half
  %35 = select i1 %31, half %value_phi1, half %34
  ret half %35
}

define i32 @main() {
  %val = call half @julia_g_219(i64 65520, i64 4, i32 1)
  %cmp = fcmp oeq half %val, 0xH7C00
  %r = zext i1 %cmp to i32
  ret i32 %r
} 

attributes #1 = { nounwind readnone speculatable willreturn }

Linking it against our internal `__gnu_h2f_ieee` and `__gnu_f2h_ieee`:

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
0

Whereas I expected 1.

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll ../toolchain/lib/clang/11.0.1/lib/linux/libclang_rt.builtins-powerpc64le.a 
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
1

Extracting our implementation of __gnu_h2f_ieee and linking it against that also has the intended result...

@staticfloat does `../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal` look kosher? Or am I linking against the trampoline by accident?

@maleadt any ideas? We could think about loading compiler-rt with Orc

maleadt · 2021-01-20T07:08:36Z

> We could think about loading compiler-rt with Orc

If linking differently makes our implementation of the intrinsics work, that would just paper over the actual issue, right?
Our current implementation of these intrinsics matches what we used to have in float.jl, and doesn't exactly match what lives in compiler-rt:

julia/src/intrinsics.cpp

Line 1366 in ec386bd

// With adjustments for round-to-nearest, ties to even.
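As a rough illustration of what "round-to-nearest, ties to even" means when dropping low bits, here is a minimal sketch. It is not the code from `intrinsics.cpp` or compiler-rt; `shift_right_rne` is a hypothetical helper written only to show the rounding rule.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low `shift` bits of x, rounding to nearest and breaking exact
 * ties toward the result whose lowest kept bit is 0 (the "even" one). */
uint32_t shift_right_rne(uint32_t x, unsigned shift)
{
    assert(shift > 0 && shift < 32);
    uint32_t kept = x >> shift;                        /* truncated result */
    uint32_t rem  = x & ((UINT32_C(1) << shift) - 1);  /* discarded bits */
    uint32_t half = UINT32_C(1) << (shift - 1);        /* exact midpoint */
    if (rem > half || (rem == half && (kept & 1u)))
        kept += 1;                                     /* round up */
    return kept;
}
```

Applied to the value from this issue, `shift_right_rne(65520, 5)` reduces the 16-bit integer to an 11-bit significand: the truncated result is 2047 (odd) and the discarded bits are exactly the midpoint, so it rounds up to 2048. That no longer fits in 11 bits, which corresponds to carrying into 2^16, beyond the Float16 range.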

staticfloat · 2021-01-20T07:09:47Z

> does `../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal` look kosher? Or am I linking against the trampoline by accident?

No, if you link against julia-internal that's correct.

vchuravy · 2021-01-20T15:33:33Z

Yeah looking at the code linked against libjulia-internal...

Run till exit from #0  __gnu_h2f_ieee (param=31744) at /nobackup/users/vchuravy/dev/julia/src/intrinsics.cpp:1488
0x00000000100008f0 in julia_g_219 ()
Value returned is $1 = -inf
Run till exit from #0  __gnu_h2f_ieee (param=31744) at /nobackup/users/vchuravy/dev/julia/src/intrinsics.cpp:1488
0x0000000010000944 in julia_g_219 ()
Value returned is $5 = inf

inf is the right answer.

vchuravy · 2021-01-20T15:54:45Z

The plot thickens, compiling half.c with optimizations:

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -O2 -g2 -mcpu=pwr8 -c -fpic half.c                                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -shared -o libhalf.so half.o                                                                                                                                                         
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf
[vchuravy@service0001 jl-pwr8]$ ./a.out 
[vchuravy@service0001 jl-pwr8]$ echo $?                                                                                                                                                                                                       
0

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -O0 -g2 -mcpu=pwr8 -c -fpic half.c                                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -shared -o libhalf.so half.o                                                                                                                                                         
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
1

vchuravy · 2021-01-20T15:57:39Z

Not limited to Clang.

[vchuravy@service0001 jl-pwr8]$ gcc  -O2 -g2 -mcpu=power8 -c -fpic half.c                                                                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ gcc -shared -o libhalf.so half.o                                                                                                                                                                              
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf
[vchuravy@service0001 jl-pwr8]$ ./a.out; echo $?                                                                                                                                                                                              
0

[vchuravy@service0001 jl-pwr8]$ gcc  -O0 -g2 -mcpu=power8 -c -fpic half.c                                                                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ gcc -shared -o libhalf.so half.o                                                                                                                                                                              
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ./a.out; echo $?
1

vchuravy · 2021-02-04T01:14:41Z

OK, we are passing the argument in r3. On the call that yields the wrong result, r3 is 0xffff7c00 (4294933504); on a call that yields the right result, r3 is 0x7c00 (31744). So the caller either didn't clear the upper bits or sign-extended the value.
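To make the register observation concrete, here is a small model of how stale upper bits could flip the sign. This is a sketch under assumptions, not the actual `__gnu_h2f_ieee` from `intrinsics.cpp` and not the code any compiler generated: `decode_half_from_reg` and its `mask_first` flag are hypothetical, and the "trusting" path simply assumes that an optimized callee may test `reg >> 15` on the whole register because it expects the caller to have zero-extended the 16-bit argument.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret an IEEE-754 binary32 bit pattern as a float. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Decode a half whose 16 payload bits sit in the low end of a 64-bit
 * register image. With mask_first == 0 the decoder trusts the caller to
 * have cleared bits 16..63, so any set bit at or above bit 15 reads as
 * "negative". With mask_first != 0 the upper bits are discarded first. */
float decode_half_from_reg(uint64_t reg, int mask_first)
{
    if (mask_first)
        reg &= UINT64_C(0xffff);

    int negative = (reg >> 15) != 0;
    uint32_t exp = (uint32_t)(reg >> 10) & 0x1fu;
    uint32_t sig = (uint32_t)reg & 0x3ffu;

    if (exp == 0) {
        /* Zero or subnormal: sig * 2^-24, exactly representable as float. */
        float mag = (float)sig * (1.0f / 16777216.0f);
        return negative ? -mag : mag;
    }

    uint32_t bits = (exp == 0x1f)
        ? (0x7f800000u | (sig << 13))            /* Inf or NaN */
        : (((exp + 112u) << 23) | (sig << 13));  /* normal: rebias 15 -> 127 */
    if (negative)
        bits |= 0x80000000u;
    return float_from_bits(bits);
}

/* The two register values reported above, decoded both ways. */
void check_decode_half_from_reg(void)
{
    assert(decode_half_from_reg(UINT64_C(0x7c00), 0) == INFINITY);
    assert(decode_half_from_reg(UINT64_C(0xffff7c00), 0) == -INFINITY);
    assert(decode_half_from_reg(UINT64_C(0xffff7c00), 1) == INFINITY);
}
```

In this model, 0x7c00 decodes to `+Inf` either way, while 0xffff7c00 decodes to `-Inf` unless the upper bits are masked off first. That would be consistent with the `-O0` and `-O2` builds of `half.c` disagreeing, though the model does not establish which side is responsible for clearing the bits.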

vchuravy · 2021-02-18T16:12:06Z

Fixed by #39712

vchuravy added the system:powerpc (PowerPC), bignums (BigInt and BigFloat), and float16 labels Jan 20, 2021

vchuravy changed the title ~~[PPC/Pwr8] Half-precision floating operations~~ [PPC/Pwr8] Half-precision conversion from BigInt Jan 20, 2021

vchuravy added the upstream (The issue is with an upstream dependency, e.g. LLVM) label Feb 8, 2021

vchuravy closed this as completed Feb 18, 2021
