SystemZ Backend: Add support for operations such as FP16_TO_FP and FP_TO_FP16 #50374

Open
kun-lu20 opened this issue Jul 8, 2021 · 8 comments
Labels: backend:SystemZ, bugzilla (issues migrated from Bugzilla)

Comments

@kun-lu20

kun-lu20 commented Jul 8, 2021

Bugzilla Link 51030
Version unspecified
OS Linux
CC @River707,@ftynse

Extended Description

Hi,

We have recently been running the TensorFlow v2.5.0 test suite on s390x (Ubuntu 18.04).

The test case //tensorflow/compiler/tests:sort_ops_test_cpu fails with the following error:

LLVM ERROR: Cannot select: 0x3ff14167ca0: f32 = fp16_to_fp 0x3ff14167f10
  0x3ff14167f10: i32,ch = load<(dereferenceable load 2 from %ir.4, !alias.scope !6, !noalias !4), zext from i16> 0x3ff14197548, 0x3ff141678f8, undef:i64
    0x3ff141678f8: i64,ch = load<(load 8 from %ir.3)> 0x3ff14197548, 0x3ff14167890, undef:i64
      0x3ff14167890: i64 = add nuw 0x3ff141674e8, Constant:i64<8>
        0x3ff141674e8: i64,ch = CopyFromReg 0x3ff14197548, Register:i64 %2
          0x3ff14167480: i64 = Register %2
        0x3ff14167828: i64 = Constant<8>
      0x3ff14167758: i64 = undef
    0x3ff14167758: i64 = undef
In function: compare_lt_WCTTAtafbb4__.7

Other test cases, such as //tensorflow/python/keras/optimizer_v2:adam_test and //tensorflow/core/kernels/mlir_generated:abs_cpu_f16_f16_gen_test, also fail on s390x for similar reasons. A related issue (tensorflow/tensorflow#44362) has been raised on the TensorFlow GitHub issue tracker.

We think the root cause is that the SystemZ LLVM backend (llvm/lib/Target/SystemZ/SystemZISelLowering.cpp) lacks support for operations such as FP16_TO_FP and FP_TO_FP16, which perform promotion and truncation of half-precision (16-bit) floating-point numbers. Could support for these operations be added to the SystemZ LLVM backend? Thanks!
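
To illustrate what the fp16_to_fp node in the error above is asked to compute, here is a minimal C sketch that widens an IEEE 754 binary16 bit pattern to a 32-bit float in software. The helper name half_bits_to_float is hypothetical, and this is not how the SystemZ backend or any LLVM runtime library implements the conversion; it only shows the semantics, assuming float is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an LLVM API): widen IEEE 754 binary16 bits
 * to a float. Every half value is exactly representable as a float,
 * so no rounding is needed in this direction. */
float half_bits_to_float(uint16_t h) {
    /* Assumption: float is a 32-bit IEEE 754 binary32 value. */
    assert(sizeof(float) == sizeof(uint32_t));

    uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* sign moves to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;             /* 5-bit exponent */
    uint32_t man  = h & 0x3FFu;                    /* 10-bit mantissa */
    uint32_t bits;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, payload kept in the top bits. */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 (add 112). */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal half: normalize until the implicit bit (bit 10) is set. */
        uint32_t e = 113u; /* biased float exponent of 2^-14 */
        while ((man & 0x400u) == 0) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f); /* reinterpret the bits without aliasing issues */
    return f;
}
```

For example, half_bits_to_float(0x3C00) yields 1.0f and half_bits_to_float(0x7BFF) yields 65504.0f, the largest finite half value.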

@kun-lu20
Author

Are there any updates from the community regarding this issue? Thanks!

@joker-eph
Collaborator

Moving out of MLIR: this is a backend issue.

@kun-lu20
Author

kun-lu20 commented Oct 8, 2021

It looks like commit https://reviews.llvm.org/rG8cd8120a7b5d, which has been tagged as 13.0.0-rc, could solve this issue, since it adds support for arch14 and for operations related to FP16 conversion to the SystemZ backend. Could anyone from the community help confirm this? Thanks!

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
@kun-lu20
Author

kun-lu20 commented Apr 7, 2022

This issue persists with TensorFlow v2.8.0, which uses LLVM 15. It looks like specific half-precision (16-bit) operations are still missing in the SystemZ LLVM backend.

Could anyone from the community take a look at this issue? Thanks very much!

@kun-lu20
Author

kun-lu20 commented Jul 6, 2022

We recently ran the test cases under the //tensorflow/core/kernels/mlir_generated category in TensorFlow v2.9.1 and found that this issue still exists.

It looks like FP16/F16-related operations are still unsupported in the LLVM SystemZ backend for most Z CPU models, which causes these test cases (such as abs_cpu_f16_f16_gen_test and sqrt_cpu_f64_f64_gen_test) to fail when the applyFullConversion() or applyPartialConversion() function is invoked. Although this commit added FP16 support for the new arch14 (z16) model, arch14 still does not seem to fully support FP16 operations.

We also found that when TensorFlow is built with the options -c opt --copt=-O, which set the optimization level to 1, and with JIT_Compilation enabled, these test cases pass and the output .mlir files are generated successfully.

We think this can serve as a workaround for now, but addressing the root cause still requires adding FP16-related operations to the SystemZ backend.

Any thoughts or suggestions from the community regarding this issue would be greatly appreciated. Thanks!

@beetrees
Contributor

beetrees commented Jun 13, 2024

The following function (compiler explorer):

define half @deref(ptr %p) {
  %x = load half, ptr %p
  ret half %x
}

currently fails to compile when compiling for s390x-unknown-linux-gnu with the following error:

LLVM ERROR: Cannot select: 0x89e9c70: f32,ch = load<(load (s16) from %ir.p), anyext from f16> 0x89a9bc8, 0x89e9c00, undef:i64
  0x89e9c00: i64,ch = CopyFromReg 0x89a9bc8, Register:i64 %0
    0x89e9b90: i64 = Register %0
  0x89e9ce0: i64 = undef
In function: deref
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /opt/compiler-explorer/clang-trunk/bin/llc -o /app/output.s -mtriple=s390x-unknown-linux-gnu <source>
1.	Running pass 'Function Pass Manager' on module '<source>'.
2.	Running pass 'SystemZ DAG->DAG Pattern Instruction Selection' on function '@deref'
 #0 0x00000000037197d8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/llc+0x37197d8)
 #1 0x000000000371714c SignalHandler(int) Signals.cpp:0:0
 #2 0x00007baf41042520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #3 0x00007baf410969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
 #4 0x00007baf41042476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
 #5 0x00007baf410287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
 #6 0x000000000073359e llvm::UniqueStringSaver::save(llvm::StringRef) (.cold) StringSaver.cpp:0:0
 #7 0x00000000034dca44 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) (/opt/compiler-explorer/clang-trunk/bin/llc+0x34dca44)
 #8 0x00000000034e3e85 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) (/opt/compiler-explorer/clang-trunk/bin/llc+0x34e3e85)
 #9 0x00000000019ed9de (anonymous namespace)::SystemZDAGToDAGISel::Select(llvm::SDNode*) SystemZISelDAGToDAG.cpp:0:0
#10 0x00000000034d9f94 llvm::SelectionDAGISel::DoInstructionSelection() (/opt/compiler-explorer/clang-trunk/bin/llc+0x34d9f94)
#11 0x00000000034e92a1 llvm::SelectionDAGISel::CodeGenAndEmitDAG() (/opt/compiler-explorer/clang-trunk/bin/llc+0x34e92a1)
#12 0x00000000034ebed4 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x34ebed4)
#13 0x00000000034edd44 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x34edd44)
#14 0x00000000019efe0a (anonymous namespace)::SystemZDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) SystemZISelDAGToDAG.cpp:0:0
#15 0x00000000034dd861 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x34dd861)
#16 0x000000000282216b llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.0) MachineFunctionPass.cpp:0:0
#17 0x0000000002d58b22 llvm::FPPassManager::runOnFunction(llvm::Function&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2d58b22)
#18 0x0000000002d58ca1 llvm::FPPassManager::runOnModule(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2d58ca1)
#19 0x0000000002d5a950 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/compiler-explorer/clang-trunk/bin/llc+0x2d5a950)
#20 0x000000000084df94 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#21 0x0000000000745af6 main (/opt/compiler-explorer/clang-trunk/bin/llc+0x745af6)
#22 0x00007baf41029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
#23 0x00007baf41029e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
#24 0x0000000000845bce _start (/opt/compiler-explorer/clang-trunk/bin/llc+0x845bce)
Program terminated with signal: SIGSEGV
Compiler returned: 139

Other operations involving half also fail with similar errors.

@alexrp
Member

alexrp commented Jun 28, 2024

FWIW, this is the only remaining blocker I'm aware of for Zig to be able to target s390x:

zig cc s390x.c -target s390x-linux-musl
LLVM ERROR: Cannot select: 0x6d24170: i32 = fp_to_fp16 0x6d23a00
  0x6d23a00: f32,ch = CopyFromReg 0x5e82a10, Register:f32 %10
    0x6c03da0: f32 = Register %10
In function: __fixhfsi
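
For context on the fp_to_fp16 node in that error, here is a minimal C sketch of the narrowing direction: truncating a 32-bit float to IEEE 754 binary16 bits with round-to-nearest-even. The helper name float_to_half_bits is hypothetical, and this is an illustration of the semantics only, not the lowering LLVM uses or the code in any runtime library; it assumes float is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an LLVM API): narrow a float to IEEE 754
 * binary16 bits, rounding to nearest with ties to even. */
uint16_t float_to_half_bits(float f) {
    /* Assumption: float is a 32-bit IEEE 754 binary32 value. */
    assert(sizeof(float) == sizeof(uint32_t));

    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u; /* sign moves to bit 15 */
    uint32_t exp  = (x >> 23) & 0xFFu;   /* 8-bit exponent */
    uint32_t man  = x & 0x7FFFFFu;       /* 23-bit mantissa */

    if (exp == 0xFFu) {
        /* Infinity stays infinity; NaN keeps a set mantissa bit so it stays NaN. */
        uint32_t payload = man ? (0x200u | (man >> 13)) : 0u;
        return (uint16_t)(sign | 0x7C00u | payload);
    }

    int32_t e = (int32_t)exp - 127 + 15; /* rebias from 127 to 15 */
    if (e >= 0x1F) {
        return (uint16_t)(sign | 0x7C00u); /* too large: overflow to infinity */
    }
    if (e <= 0) {
        if (e < -10) {
            return (uint16_t)sign; /* below half of the smallest subnormal: zero */
        }
        /* Result is a subnormal half: make the implicit bit explicit and shift. */
        man |= 0x800000u;
        uint32_t shift    = (uint32_t)(14 - e); /* between 14 and 24 */
        uint32_t half_man = man >> shift;
        uint32_t rem      = man & ((1u << shift) - 1u);
        uint32_t halfway  = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half_man & 1u))) {
            half_man++; /* may carry into the smallest normal, which is correct */
        }
        return (uint16_t)(sign | half_man);
    }

    /* Normal result: drop the low 13 mantissa bits with rounding. */
    uint32_t out = ((uint32_t)e << 10) | (man >> 13);
    uint32_t rem = man & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (out & 1u))) {
        out++; /* a carry into the exponent is intended, including up to infinity */
    }
    return (uint16_t)(sign | out);
}
```

For example, float_to_half_bits(1.0f) yields 0x3C00, and a value too large for half, such as 1.0e5f, yields 0x7C00 (infinity).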

uweigand added a commit to uweigand/rust that referenced this issue Jul 10, 2024
On s390x, every use of the f16 data type will currently ICE
due to llvm/llvm-project#50374,
causing doctest failures on the platform.

Most doctests were already restricted to certain platforms,
so fix this by likewise restricting the remaining five.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jul 11, 2024
…oss35

core: Limit remaining f16 doctests to x86_64 linux

rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jul 12, 2024
Rollup merge of rust-lang#127588 - uweigand:s390x-f16-doctests, r=tgross35

core: Limit remaining f16 doctests to x86_64 linux

alexrp added a commit to alexrp/zig that referenced this issue Aug 12, 2024
andrewrk pushed a commit to ziglang/zig that referenced this issue Aug 12, 2024
SammyJames pushed a commit to SammyJames/zig that referenced this issue Aug 13, 2024
@JonPsson1 JonPsson1 self-assigned this Oct 7, 2024
@JonPsson1
Contributor

Patch in progress here: #109164

richerfu pushed a commit to richerfu/zig that referenced this issue Oct 28, 2024