[PPC/Pwr8] Half-precision conversion from BigInt #39330

Closed
vchuravy opened this issue Jan 20, 2021 · 8 comments
Labels
bignums BigInt and BigFloat float16 system:powerpc PowerPC upstream The issue is with an upstream dependency, e.g. LLVM

Comments

@vchuravy
Member

vchuravy commented Jan 20, 2021

Conversion from BigInt

gmp                                (78) |         failed at 2021-01-19T21:39:56.822
Test Failed at /nobackup/users/vchuravy/dev/julia/test/gmp.jl:501
  Expression: T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T))) === T(Inf)
   Evaluated: -Inf16 === Inf16

On Power9 julia -C pwr9

julia> T=Float16; n = exponent(floatmax(T))
15

julia> T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T)))
Inf16

On Power9 julia -C pwr8

julia> T=Float16; n = exponent(floatmax(T))
15

julia> T(big"2" ^ (n + 1) - big"2" ^ (n - precision(T)))
-Inf16
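The test value can be worked out with plain integer arithmetic. The sketch below is in Python rather than Julia, and its helper names are made up for illustration. It assumes only the IEEE 754 binary16 parameters that the session above reports (precision 11, maximum exponent 15).

```python
# Float16 (IEEE 754 binary16) parameters, mirroring the Julia session above.
PRECISION = 11  # precision(Float16): 10 stored bits plus the implicit leading bit
N = 15          # exponent(floatmax(Float16))


def test_value(n: int = N, p: int = PRECISION) -> int:
    """The BigInt expression from test/gmp.jl: 2^(n + 1) - 2^(n - p)."""
    return 2 ** (n + 1) - 2 ** (n - p)


def floatmax_f16() -> int:
    """Largest finite Float16: an all-ones 11-bit significand at the top exponent."""
    return (2 ** PRECISION - 1) * 2 ** (N - PRECISION + 1)


def tie_point() -> int:
    """Midpoint between floatmax and 2^(N + 1), the first value past the finite range."""
    return (floatmax_f16() + 2 ** (N + 1)) // 2


if __name__ == "__main__":
    print(test_value(), floatmax_f16(), tie_point())
```

The test value is 65520, which sits exactly halfway between floatmax (65504) and 65536. It is therefore a rounding tie, and only the sign of the result differs between the two sessions.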

Gist: https://gist.github.com/vchuravy/a0170b4d42e46d800aae53b3a80bb582

Duplicate of #38896 which I closed since fixing #38883 made it harder to reproduce until #39300

Upstream: https://bugs.llvm.org/show_bug.cgi?id=49092

@vchuravy vchuravy added system:powerpc PowerPC bignums BigInt and BigFloat float16 labels Jan 20, 2021
@vchuravy vchuravy changed the title [PPC/Pwr8] Half-precision floating operations [PPC/Pwr8] Half-precision conversion from BigInt Jan 20, 2021
@vchuravy
Member Author

Reduce to:

source_filename = "g"
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

declare i64 @llvm.ctlz.i64(i64, i1 immarg) #1

define half @julia_g_219(i64 zeroext %0, i64 signext %1, i32 signext %2) {
top:
  %3 = call i64 @llvm.ctlz.i64(i64 %0, i1 false)
  %4 = icmp ugt i64 %0, 65535
  %5 = icmp slt i32 %2, 0
  %6 = sub i32 0, %2
  %7 = select i1 %5, i32 %6, i32 %2
  %8 = icmp sgt i32 %7, 1
  %value_phi.in = or i1 %4, %8
  br i1 %value_phi.in, label %L48, label %L14

L14:                                              ; preds = %top
  %9 = sub nsw i64 52, %3
  %10 = icmp ult i64 %0, 2048
  %11 = lshr i64 %0, %9
  %12 = icmp ugt i64 %9, 63
  %13 = select i1 %12, i64 0, i64 %11
  %14 = sub nsw i64 0, %9
  %15 = shl i64 %0, %14
  %16 = icmp ugt i64 %14, 63
  %17 = select i1 %16, i64 0, i64 %15
  %18 = select i1 %10, i64 %17, i64 %13
  %19 = trunc i64 %18 to i16
  %20 = add i16 %19, 1
  %21 = lshr i16 %20, 1
  %22 = icmp eq i64 %9, %1
  %23 = zext i1 %22 to i16
  %24 = xor i16 %23, -1
  %25 = and i16 %21, %24
  %26 = trunc i64 %3 to i16
  %27 = shl i16 %26, 10
  %28 = sub i16 13312, %27
  %29 = add i16 %28, %25
  %30 = bitcast i16 %29 to half
  br label %L48

L48:                                              ; preds = %L14, %top
  %value_phi1 = phi half [ %30, %L14 ], [ 0xH7C00, %top ]
  %31 = icmp sgt i32 %2, -1
  %32 = fpext half %value_phi1 to float
  %33 = fneg float %32
  %34 = fptrunc float %33 to half
  %35 = select i1 %31, half %value_phi1, half %34
  ret half %35
}

define i32 @main() {
  %val = call half @julia_g_219(i64 65520, i64 4, i32 1)
  %cmp = fcmp oeq half %val, 0xH7C00
  %r = zext i1 %cmp to i32
  ret i32 %r
} 

attributes #1 = { nounwind readnone speculatable willreturn }
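To see what the reduced function should return, here is a hand translation of its integer arithmetic into Python. It is a sketch for checking the expected bit pattern, not generated code. The function name and masks are my own, and the fpext/fneg/fptrunc sequence is modelled as a plain sign-bit flip, which assumes a correct half/float round trip.

```python
MASK64 = (1 << 64) - 1
MASK16 = (1 << 16) - 1


def ctlz64(x: int) -> int:
    """Count leading zeros of a 64-bit unsigned value (llvm.ctlz.i64)."""
    return 64 - x.bit_length()


def julia_g_219_bits(x: int, tz: int, sign: int) -> int:
    """Model of @julia_g_219: returns the binary16 bit pattern of the result."""
    lz = ctlz64(x)                                    # %3
    if x > 65535 or abs(sign) > 1:                    # %4, %8
        bits = 0x7C00                                 # phi constant coming from %top
    else:
        sh = 52 - lz                                  # %9 (negative when x < 2048)
        if x < 2048:                                  # %10
            left = -sh                                # %14
            m = (x << left) & MASK64 if 0 <= left <= 63 else 0  # %15-%17
        else:
            m = x >> sh if 0 <= sh <= 63 else 0       # %11-%13
        m16 = m & MASK16                              # %19
        rounded = ((m16 + 1) & MASK16) >> 1           # %20, %21: round half up
        tie = 1 if sh == tz else 0                    # %22, %23
        rounded &= ~tie & MASK16                      # %24, %25: clear low bit on a tie
        exp_part = (13312 - ((lz << 10) & MASK16)) & MASK16  # %26-%28
        bits = (exp_part + rounded) & MASK16          # %29
    # %31-%35: keep the value for a non-negative sign, otherwise negate it.
    return bits if sign > -1 else bits ^ 0x8000


if __name__ == "__main__":
    print(hex(julia_g_219_bits(65520, 4, 1)))
```

With the arguments used in `main` (65520, 4, 1) this model gives 0x7C00, the bit pattern of Inf16, so the `fcmp oeq` should be true and the process should exit with status 1.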

Linking it against our internal __gnu_h2f_ieee and __gnu_f2h_ieee:

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
0

Whereas I expected 1.

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll ../toolchain/lib/clang/11.0.1/lib/linux/libclang_rt.builtins-powerpc64le.a 
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
1

Extracting our implementation of __gnu_h2f_ieee and linking it against that also has the intended result...

@staticfloat does ../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal look kosher? Or am I linking against the trampoline by accident?

@maleadt any ideas? We could think about loading compiler-rt with Orc

@maleadt
Member

maleadt commented Jan 20, 2021

We could think about loading compiler-rt with Orc

If linking differently makes our implementation of the intrinsics work, that would just paper over the actual issue, right?
Our current implementation of these intrinsics matches what we used to have in float.jl, and doesn't exactly match what lives in compiler-rt:

// With adjustments for round-to-nearest, ties to even.
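For reference, round-to-nearest, ties-to-even on an integer right shift can be sketched as follows. This is a generic Python illustration with a made-up helper name. It is not the code from float.jl, intrinsics.cpp, or compiler-rt.

```python
def shift_right_rne(x: int, shift: int) -> int:
    """Shift a non-negative integer right, rounding to nearest with ties to even."""
    if shift == 0:
        return x
    quotient = x >> shift
    remainder = x & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round up when past the midpoint, or exactly on it with an odd quotient.
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient


if __name__ == "__main__":
    # 65520 / 32 = 2047.5 is a tie and 2047 is odd, so it rounds up to 2048.
    print(shift_right_rne(65520, 5))
```

Applied to the value from the failing test, 65520 shifted by 5 gives 2048, which no longer fits in an 11-bit significand. That carry out of the significand is why the correct Float16 result is Inf.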

@staticfloat
Member

does ../toolchain/tools/clang -mcpu=pwr8 main.ll -Wl,-rpath=usr/lib -Lusr/lib -ljulia-internal look kosher? Or am I linking against the trampoline by accident?

No, if you link against julia-internal that's correct.

@vchuravy
Member Author

vchuravy commented Jan 20, 2021

Yeah looking at the code linked against libjulia-internal...

Run till exit from #0  __gnu_h2f_ieee (param=31744) at /nobackup/users/vchuravy/dev/julia/src/intrinsics.cpp:1488
0x00000000100008f0 in julia_g_219 ()
Value returned is $1 = -inf
Run till exit from #0  __gnu_h2f_ieee (param=31744) at /nobackup/users/vchuravy/dev/julia/src/intrinsics.cpp:1488
0x0000000010000944 in julia_g_219 ()
Value returned is $5 = inf

inf is the right answer.

@vchuravy
Member Author

The plot thickens when compiling half.c with optimizations:

[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -O2 -g2 -mcpu=pwr8 -c -fpic half.c                                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -shared -o libhalf.so half.o                                                                                                                                                         
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf
[vchuravy@service0001 jl-pwr8]$ ./a.out 
[vchuravy@service0001 jl-pwr8]$ echo $?                                                                                                                                                                                                       
0
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -O0 -g2 -mcpu=pwr8 -c -fpic half.c                                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -shared -o libhalf.so half.o                                                                                                                                                         
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ./a.out ; echo $?
1

@vchuravy
Member Author

Not limited to Clang.

[vchuravy@service0001 jl-pwr8]$ gcc  -O2 -g2 -mcpu=power8 -c -fpic half.c                                                                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ gcc -shared -o libhalf.so half.o                                                                                                                                                                              
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf
[vchuravy@service0001 jl-pwr8]$ ./a.out; echo $?                                                                                                                                                                                              
0
[vchuravy@service0001 jl-pwr8]$ gcc  -O0 -g2 -mcpu=power8 -c -fpic half.c                                                                                                                                                                     
[vchuravy@service0001 jl-pwr8]$ gcc -shared -o libhalf.so half.o                                                                                                                                                                              
[vchuravy@service0001 jl-pwr8]$ ../toolchain/tools/clang -mcpu=pwr8 main.ll -L`pwd` -Wl,-rpath=`pwd` -lhalf                                                                                                                                   
[vchuravy@service0001 jl-pwr8]$ ./a.out; echo $?
1

@vchuravy
Member Author

vchuravy commented Feb 4, 2021

OK, we are passing the argument in r3. On the call that yields the wrong result, r3 is 0xffff7c00 (4294933504); on the call that yields the right result, r3 is 0x7c00 (31744). So the caller either didn't clear the upper bits or sign-extended the value.
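A small Python sketch of why the upper bits matter. The two helpers are hypothetical: the instruction sequence the optimizer actually emits for half.c is not shown in this thread, so the unmasked variant is only an assumption about how a callee that trusts a zero-extended argument could go wrong.

```python
GOOD_R3 = 0x7C00      # r3 on the call that returns inf
BAD_R3 = 0xFFFF7C00   # r3 on the call that returns -inf


def sign_bit_masked(reg: int) -> int:
    """Sign bit of a binary16 value, looking only at the low 16 bits of the register."""
    return ((reg & 0xFFFF) >> 15) & 1


def sign_bit_assuming_zero_extended(reg: int) -> int:
    """Hypothetical shortcut that is only valid when bits 16 and above are zero."""
    return 1 if (reg >> 15) != 0 else 0


if __name__ == "__main__":
    for reg in (GOOD_R3, BAD_R3):
        print(hex(reg), sign_bit_masked(reg), sign_bit_assuming_zero_extended(reg))
```

Both registers hold the same low 16 bits (0x7c00, Inf16). Only the variant that skips the mask reports a set sign bit for 0xffff7c00. That would be consistent with the -O0 and -O2 builds of half.c disagreeing.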

@vchuravy vchuravy added the upstream The issue is with an upstream dependency, e.g. LLVM label Feb 8, 2021
@vchuravy
Member Author

Fixed by #39712
