[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand #106153

v01dXYZ · 2024-08-26T22:34:48Z

For the concerned targets (X86, NVPTX), Expand clears the sign bit after bitcasting to i16. This is cheaper than Promote that requires converting back and forth from f16 to f32 (with possibly expensive libcalls) in order to call FABS.f32

Fixes #104915.

llvmbot · 2024-08-26T22:35:18Z

@llvm/pr-subscribers-backend-nvptx

@llvm/pr-subscribers-backend-x86

Author: None (v01dXYZ)

Changes

For the concerned targets (X86, NVPTX), Expand clears the sign bit after bitcasting to i16. This is cheaper than Promote that requires converting back and forth from f16 to f32 (with possibly expensive libcalls) in order to call FABS.f32

Full diff: https://github.com/llvm/llvm-project/pull/106153.diff

6 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp (+2-1)
(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+2-1)
(modified) llvm/test/CodeGen/NVPTX/f16-instructions.ll (+4-8)
(modified) llvm/test/CodeGen/NVPTX/f16x2-instructions.ll (+7-11)
(modified) llvm/test/CodeGen/X86/fp16-libcalls.ll (+7-28)
(modified) llvm/test/CodeGen/X86/half.ll (+5-7)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 73791102fc04de..5b0a87d5f150f8 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -849,7 +849,8 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
     AddPromotedToType(Op, MVT::bf16, MVT::f32);
   }
   for (const auto &Op : {ISD::FABS}) {
-    setOperationAction(Op, MVT::f16, Promote);
+    // Expand instead of Promote to clear sign bit by bitcasting to i16
+    setOperationAction(Op, MVT::f16, Expand);
     setOperationAction(Op, MVT::f32, Legal);
     setOperationAction(Op, MVT::f64, Legal);
     setOperationAction(Op, MVT::v2f16, Expand);
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 1a6be4eb5af1ef..274a03b6664736 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -599,7 +599,8 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
   setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f64, Legal);
 
   auto setF16Action = [&] (MVT VT, LegalizeAction Action) {
-    setOperationAction(ISD::FABS, VT, Action);
+    // Expand instead of Promote to clear sign bit by bitcasting to i16.
+    setOperationAction(ISD::FABS, VT, Expand);
     setOperationAction(ISD::FNEG, VT, Action);
     setOperationAction(ISD::FCOPYSIGN, VT, Expand);
     setOperationAction(ISD::FREM, VT, Action);
diff --git a/llvm/test/CodeGen/NVPTX/f16-instructions.ll b/llvm/test/CodeGen/NVPTX/f16-instructions.ll
index 14e02a49f6e5e4..cfabf1d6639c2a 100644
--- a/llvm/test/CodeGen/NVPTX/f16-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/f16-instructions.ll
@@ -981,14 +981,10 @@ define half @test_fma(half %a, half %b, half %c) #0 {
 }
 
 ; CHECK-LABEL: test_fabs(
-; CHECK:      ld.param.b16    [[A:%rs[0-9]+]], [test_fabs_param_0];
-; CHECK-NOFTZ:      cvt.f32.f16     [[AF:%f[0-9]+]], [[A]];
-; CHECK-NOFTZ:      abs.f32         [[RF:%f[0-9]+]], [[AF]];
-; CHECK-F16-FTZ:      cvt.ftz.f32.f16     [[AF:%f[0-9]+]], [[A]];
-; CHECK-F16-FTZ:      abs.ftz.f32         [[RF:%f[0-9]+]], [[AF]];
-; CHECK:      cvt.rn.f16.f32  [[R:%rs[0-9]+]], [[RF]];
-; CHECK:      st.param.b16    [func_retval0+0], [[R]];
-; CHECK:      ret;
+; CHECK:    ld.param.b16    [[A:%rs[0-9]+]], [test_fabs_param_0];
+; CHECK:    and.b16 [[RF:%rs[0-9]+]], [[A]], 32767;
+; CHECK:    st.param.b16 [func_retval0+0], [[RF]];
+; CHECK:    ret;
 define half @test_fabs(half %a) #0 {
   %r = call half @llvm.fabs.f16(half %a)
   ret half %r
diff --git a/llvm/test/CodeGen/NVPTX/f16x2-instructions.ll b/llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
index 464b3a754804fe..c2189a20c98c25 100644
--- a/llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
@@ -1183,17 +1183,13 @@ define <2 x half> @test_fma(<2 x half> %a, <2 x half> %b, <2 x half> %c) #0 {
 }
 
 ; CHECK-LABEL: test_fabs(
-; CHECK:      ld.param.b32    [[A:%r[0-9]+]], [test_fabs_param_0];
-; CHECK:      mov.b32         {[[A0:%rs[0-9]+]], [[A1:%rs[0-9]+]]}, [[A]]
-; CHECK-DAG:  cvt.f32.f16     [[AF0:%f[0-9]+]], [[A0]];
-; CHECK-DAG:  cvt.f32.f16     [[AF1:%f[0-9]+]], [[A1]];
-; CHECK-DAG:  abs.f32         [[RF0:%f[0-9]+]], [[AF0]];
-; CHECK-DAG:  abs.f32         [[RF1:%f[0-9]+]], [[AF1]];
-; CHECK-DAG:  cvt.rn.f16.f32  [[R0:%rs[0-9]+]], [[RF0]];
-; CHECK-DAG:  cvt.rn.f16.f32  [[R1:%rs[0-9]+]], [[RF1]];
-; CHECK:      mov.b32         [[R:%r[0-9]+]], {[[R0]], [[R1]]}
-; CHECK:      st.param.b32    [func_retval0+0], [[R]];
-; CHECK:      ret;
+; CHECK:    ld.param.b32    [[A:%r[0-9]+]], [test_fabs_param_0];
+; CHECK:    mov.b32         {[[A0:%rs[0-9]+]], [[A1:%rs[0-9]+]]}, [[A]]
+; CHECK:    and.b16 [[A2:%rs[0-9]+]], [[A1]], 32767;
+; CHECK:    and.b16 [[A3:%rs[0-9]+]], [[A0]], 32767;
+; CHECK:    mov.b32 [[B:%r[0-9]+]], {[[A3]], [[A2]]};
+; CHECK:    st.param.b32 [func_retval0+0], [[B]];
+; CHECK:    ret;
 define <2 x half> @test_fabs(<2 x half> %a) #0 {
   %r = call <2 x half> @llvm.fabs.f16(<2 x half> %a)
   ret <2 x half> %r
diff --git a/llvm/test/CodeGen/X86/fp16-libcalls.ll b/llvm/test/CodeGen/X86/fp16-libcalls.ll
index db3d031a8fe3fb..ee342f22eeff66 100644
--- a/llvm/test/CodeGen/X86/fp16-libcalls.ll
+++ b/llvm/test/CodeGen/X86/fp16-libcalls.ll
@@ -124,11 +124,7 @@ define void @test_half_fabs(half %a0, ptr %p0) nounwind {
 ; F16C-LABEL: test_half_fabs:
 ; F16C:       # %bb.0:
 ; F16C-NEXT:    vpextrw $0, %xmm0, %eax
-; F16C-NEXT:    vmovd %eax, %xmm0
-; F16C-NEXT:    vcvtph2ps %xmm0, %xmm0
-; F16C-NEXT:    vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
-; F16C-NEXT:    vcvtps2ph $4, %xmm0, %xmm0
-; F16C-NEXT:    vmovd %xmm0, %eax
+; F16C-NEXT:    andl $32767, %eax # imm = 0x7FFF
 ; F16C-NEXT:    movw %ax, (%rdi)
 ; F16C-NEXT:    retq
 ;
@@ -141,34 +137,17 @@ define void @test_half_fabs(half %a0, ptr %p0) nounwind {
 ;
 ; X64-LABEL: test_half_fabs:
 ; X64:       # %bb.0:
-; X64-NEXT:    pushq %rbx
-; X64-NEXT:    movq %rdi, %rbx
-; X64-NEXT:    callq __extendhfsf2@PLT
-; X64-NEXT:    pand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
-; X64-NEXT:    callq __truncsfhf2@PLT
 ; X64-NEXT:    pextrw $0, %xmm0, %eax
-; X64-NEXT:    movw %ax, (%rbx)
-; X64-NEXT:    popq %rbx
+; X64-NEXT:    andl $32767, %eax # imm = 0x7FFF
+; X64-NEXT:    movw %ax, (%rdi)
 ; X64-NEXT:    retq
 ;
 ; X86-LABEL: test_half_fabs:
 ; X86:       # %bb.0:
-; X86-NEXT:    pushl %esi
-; X86-NEXT:    subl $8, %esp
-; X86-NEXT:    pinsrw $0, {{[0-9]+}}(%esp), %xmm0
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %esi
-; X86-NEXT:    pextrw $0, %xmm0, %eax
-; X86-NEXT:    movw %ax, (%esp)
-; X86-NEXT:    calll __extendhfsf2
-; X86-NEXT:    fstps {{[0-9]+}}(%esp)
-; X86-NEXT:    movd {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; X86-NEXT:    pand {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
-; X86-NEXT:    movd %xmm0, (%esp)
-; X86-NEXT:    calll __truncsfhf2
-; X86-NEXT:    pextrw $0, %xmm0, %eax
-; X86-NEXT:    movw %ax, (%esi)
-; X86-NEXT:    addl $8, %esp
-; X86-NEXT:    popl %esi
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    movzwl {{[0-9]+}}(%esp), %ecx
+; X86-NEXT:    andl $32767, %ecx # imm = 0x7FFF
+; X86-NEXT:    movw %cx, (%eax)
 ; X86-NEXT:    retl
   %res = call half @llvm.fabs.half(half %a0)
   store half %res, ptr %p0, align 2
diff --git a/llvm/test/CodeGen/X86/half.ll b/llvm/test/CodeGen/X86/half.ll
index 9f01d07e6a6705..6838925240058e 100644
--- a/llvm/test/CodeGen/X86/half.ll
+++ b/llvm/test/CodeGen/X86/half.ll
@@ -1059,7 +1059,6 @@ define void @main.158() #0 {
 ; CHECK-LIBCALL:       # %bb.0: # %entry
 ; CHECK-LIBCALL-NEXT:    pushq %rax
 ; CHECK-LIBCALL-NEXT:    xorps %xmm0, %xmm0
-; CHECK-LIBCALL-NEXT:    callq __truncsfhf2@PLT
 ; CHECK-LIBCALL-NEXT:    callq __extendhfsf2@PLT
 ; CHECK-LIBCALL-NEXT:    movss {{.*#+}} xmm1 = [8.0E+0,0.0E+0,0.0E+0,0.0E+0]
 ; CHECK-LIBCALL-NEXT:    ucomiss %xmm0, %xmm1
@@ -1077,10 +1076,10 @@ define void @main.158() #0 {
 ; BWON-F16C-LABEL: main.158:
 ; BWON-F16C:       # %bb.0: # %entry
 ; BWON-F16C-NEXT:    vxorps %xmm0, %xmm0, %xmm0
-; BWON-F16C-NEXT:    vcvtps2ph $4, %xmm0, %xmm1
-; BWON-F16C-NEXT:    vcvtph2ps %xmm1, %xmm1
-; BWON-F16C-NEXT:    vmovss {{.*#+}} xmm2 = [8.0E+0,0.0E+0,0.0E+0,0.0E+0]
-; BWON-F16C-NEXT:    vucomiss %xmm1, %xmm2
+; BWON-F16C-NEXT:    vcvtph2ps %xmm0, %xmm0
+; BWON-F16C-NEXT:    vmovss {{.*#+}} xmm1 = [8.0E+0,0.0E+0,0.0E+0,0.0E+0]
+; BWON-F16C-NEXT:    vucomiss %xmm0, %xmm1
+; BWON-F16C-NEXT:    vxorps %xmm0, %xmm0, %xmm0
 ; BWON-F16C-NEXT:    jae .LBB20_2
 ; BWON-F16C-NEXT:  # %bb.1: # %entry
 ; BWON-F16C-NEXT:    vmovss {{.*#+}} xmm0 = [NaN,0.0E+0,0.0E+0,0.0E+0]
@@ -1093,8 +1092,7 @@ define void @main.158() #0 {
 ; CHECK-I686-LABEL: main.158:
 ; CHECK-I686:       # %bb.0: # %entry
 ; CHECK-I686-NEXT:    subl $12, %esp
-; CHECK-I686-NEXT:    movl $0, (%esp)
-; CHECK-I686-NEXT:    calll __truncsfhf2
+; CHECK-I686-NEXT:    pxor %xmm0, %xmm0
 ; CHECK-I686-NEXT:    pextrw $0, %xmm0, %eax
 ; CHECK-I686-NEXT:    movw %ax, (%esp)
 ; CHECK-I686-NEXT:    calll __extendhfsf2

github-actions · 2024-08-26T22:38:38Z

✅ With the latest revision this PR passed the C/C++ code formatter.

phoebewang

LGTM for X86 change, thanks!

Artem-B

LGTM.

Artem-B · 2024-08-27T17:45:34Z

llvm/test/CodeGen/NVPTX/f16x2-instructions.ll

+; CHECK:    mov.b32         {[[A0:%rs[0-9]+]], [[A1:%rs[0-9]+]]}, [[A]]
+; CHECK:    and.b16 [[A2:%rs[0-9]+]], [[A1]], 32767;
+; CHECK:    and.b16 [[A3:%rs[0-9]+]], [[A0]], 32767;
+; CHECK:    mov.b32 [[B:%r[0-9]+]], {[[A3]], [[A2]]};


Technically this could be improved further by performing and on the i32 directly. I suspect ptxas may do that for us.

Maybe add a TODO note.

I think a TODO is fine, but that we should not rely on ptxas doing it for us.

LegalizeType would type-expand FABS (<2 x f16>).

The DAG just before instruction selection:

Optimized legalized selection DAG: %bb.0 'test_fabs:' SelectionDAG has 21 nodes: t0: ch,glue = EntryToken t14: v2f16,ch = load<(dereferenceable invariant load (s32) from `ptr addrspace(101) null`, addrspace 101)> t0, TargetExternalSymbol:i64'test_fabs_param_0', undef:i64 t8: ch = CopyToReg t0, Register:v2f16 %0, t14 t16: f16 = extract_vector_elt t14, Constant:i64<0> t22: i16 = bitcast t16 t24: i16 = and t22, Constant:i16<32767> t25: f16 = bitcast t24 t19: f16 = extract_vector_elt t14, Constant:i64<1> t26: i16 = bitcast t19 t27: i16 = and t26, Constant:i16<32767> t28: f16 = bitcast t27 t21: v2f16 = BUILD_VECTOR t25, t28 t11: ch = NVPTXISD::StoreRetval<(store (s32), align 1)> t8, Constant:i32<0>, t21 t12: ch = NVPTXISD::RET_GLUE t11

Maybe detect a composition of bitwise operations on each element and merge it into a composition on the bitcasted vector itself (starting from BUILD_VECTOR).

BTW that's also the case with x86:

define <2 x half> @test_fabs(<2 x half> %a) #0 { %r = call <2 x half> @llvm.fabs.f16(<2 x half> %a) ret <2 x half> %r }

.text .file "fabs_nvptx.ll" .globl test_fabs # -- Begin function test_fabs .p2align 4, 0x90 .type test_fabs,@function test_fabs: # @test_fabs .cfi_startproc # %bb.0: pextrw $0, %xmm0, %eax psrld $16, %xmm0 pextrw $0, %xmm0, %ecx andl $32767, %ecx # imm = 0x7FFF pinsrw $0, %ecx, %xmm1 andl $32767, %eax # imm = 0x7FFF pinsrw $0, %eax, %xmm0 punpcklwd %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1],xmm0[2],xmm1[2],xmm0[3],xmm1[3] retq .Lfunc_end0: .size test_fabs, .Lfunc_end0-test_fabs .cfi_endproc # -- End function .section ".note.GNU-stack","",@progbits

But AArch64 seems not to do so (there is a combiner that combine the build_vector into a concat_vector ..., undef).

Let's not get sidetracked. This is something for a separate patch or github issue, and would be best discussed there.

A comment highlighting a future optimization opportunity is all that we need in this patch.

Artem-B · 2024-08-27T17:48:32Z

llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

@@ -849,7 +849,8 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
    AddPromotedToType(Op, MVT::bf16, MVT::f32);
  }
  for (const auto &Op : {ISD::FABS}) {
-    setOperationAction(Op, MVT::f16, Promote);
+    // Expand instead of Promote to clear sign bit by bitcasting to i16
+    setOperationAction(Op, MVT::f16, Expand);


Would operations other than FABS benefit from that, too? E.g. fneg or fcopysign?

that's a valid point.

For NVPTX, we have for f16 (so no change required)

FNEG Legal / Expand

FCOPYSIGN Expand

For X86-64, we have for f16

FNEG Promote

FCOPYSIGN Expand

v01dXYZ · 2024-08-27T21:58:45Z

@phoebewang I added FNEG/FCOPYSIGN. I rewrote a little bit the code to avoid having special cases within setF16Action (not more efficient but simpler).

Artem-B · 2024-08-27T22:23:00Z

llvm/lib/Target/X86/X86ISelLowering.cpp

+    // Expand instead of Promote to clear/flip/copy sign bit by bitcasting to
+    // i16.
+    setOperationAction(ISD::FABS, MVT::f16, Expand);
+    setOperationAction(ISD::FNEG, MVT::f16, Expand);


X86 could use test cases for fneg/fcopysign.

phoebewang · 2024-08-28T01:08:18Z

llvm/lib/Target/X86/X86ISelLowering.cpp

@@ -601,7 +601,7 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
  auto setF16Action = [&] (MVT VT, LegalizeAction Action) {
    setOperationAction(ISD::FABS, VT, Action);
    setOperationAction(ISD::FNEG, VT, Action);
-    setOperationAction(ISD::FCOPYSIGN, VT, Expand);
+    setOperationAction(ISD::FCOPYSIGN, VT, Action);


What's the reason to change this? And why don't set above Expand here?

Because it would create even more special cases that don't reflect the function name.

The function is called elsewhere with Expand unless it's f16.

That's just for consistency, I could do it here if it's better.

This function is called not only by scalar f16 but also both vector FP16 and BF16, though we don't have tests to cover them.

When it is for a vector FP/BF, it is always called with Expand.

Thanks, then it's fine for me :)

Wait.. should we remove it from setF16Action instead?

Wait.. should we remove it from setF16Action instead?

Sorry, I misunderstood. No problem then.

We can choose b/w:

isolate those Opcodes into another function.

keep them and overwrite the actions just after calling setF16Action if they are the right actions (which is the path I've taken, and that's what's in the current code with MVT::v16f16 for example).

I was thinking you meant FCOPYSIGN is set Expand by default, then I got you mean it's all set Expand in X86ISelLowering.cpp.
Just have a double check, it does set Expand by default, see
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L775
So we can simply remove it from setF16Action.

That's right. The less code the better.

RKSimon

please can you rebase? I've already added additional test coverage to fp16-libcalls.ll, including copysign/fabs/fneg etc.

…(f16) with Expand For the concerned targets, `Expand` clears/flips/copies the sign bit after bitcasting to i16. This is cheaper than `Promote` that requires converting back and forth from f16 to f32 (with possibly expensive libcalls) in order to call FABS/FNEG/FCOPYSIGN (f32). Reminder: By default, FCOPYSIGN action is Expand.

RKSimon

LGTM - cheers

arsenm

This is still papering over a bug - the promote action is just busted and needs to be replaced or removed

v01dXYZ · 2024-08-28T17:48:43Z

@arsenm

Yes, promote seems to be overkill for such operations that could be implemented as bitwise ops after move back and forth to a GPR.

For targets with f16 and i16 registers, the natural way to lower those ops is with a bitcast (unless there are native instructions for that of course). In this case, removing Promote is sound.
Concerning the case of an arch with f16 and without i16, we should instead use the smallest fitting integer type rather than promotion.

The weird case would be an arch having f16 registers but only GPR registers that are strictly smaller than i16.

arsenm · 2024-08-28T17:59:36Z

For targets with f16 have a valid i16, the natural way to lower those ops is with a bitcast (unless there are native instructions for that of course). In this case, removing Promote is sound.

The problem isn't that it's overkill, it's that it is wrong. You cannot have the canonicalization effects of the fpextend.

Concerning the case of an arch with f16 and without i16, we should instead use the smallest fitting integer type rather than promotion.

This is more of a ridiculous DAG design issue; you should be able to create the bitcast and have later legalization steps deal with it. Failing that, you could do a replacement bitcast by doing store to stack and reload

v01dXYZ · 2024-08-28T18:16:33Z

Indeed, LangRef clearly mentions the fact that only the bit sign is changed with fabs/fneg/fcopysign but says nothing about that with fpext/fptrunc.

llvmbot added backend:X86 backend:NVPTX labels Aug 26, 2024

v01dXYZ mentioned this pull request Aug 26, 2024

[LegalizeDAG] PromoteNode FABS: Override promotion with clear sign bit #106076

Closed

topperc requested review from RKSimon and phoebewang August 26, 2024 22:45

justinfargnoli self-requested a review August 26, 2024 22:51

phoebewang approved these changes Aug 27, 2024

View reviewed changes

justinfargnoli requested a review from Artem-B August 27, 2024 07:51

Artem-B approved these changes Aug 27, 2024

View reviewed changes

v01dXYZ force-pushed the 104915-legalize-dag-if-i16-legal-then-legalize-fabs-f16-with-expand branch from 87097a5 to 00d591c Compare August 27, 2024 21:55

v01dXYZ requested a review from phoebewang August 27, 2024 21:55

Artem-B reviewed Aug 27, 2024

View reviewed changes

v01dXYZ force-pushed the 104915-legalize-dag-if-i16-legal-then-legalize-fabs-f16-with-expand branch from 00d591c to ff5805c Compare August 27, 2024 22:52

v01dXYZ changed the title ~~[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS.f16 with Expand~~ [X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand Aug 27, 2024

Artem-B approved these changes Aug 27, 2024

View reviewed changes

phoebewang reviewed Aug 28, 2024

View reviewed changes

phoebewang approved these changes Aug 28, 2024

View reviewed changes

v01dXYZ force-pushed the 104915-legalize-dag-if-i16-legal-then-legalize-fabs-f16-with-expand branch 4 times, most recently from f38c1c9 to 04e7cea Compare August 28, 2024 02:08

phoebewang approved these changes Aug 28, 2024

View reviewed changes

RKSimon requested changes Aug 28, 2024

View reviewed changes

v01dXYZ force-pushed the 104915-legalize-dag-if-i16-legal-then-legalize-fabs-f16-with-expand branch from 04e7cea to d8a6a62 Compare August 28, 2024 11:42

RKSimon approved these changes Aug 28, 2024

View reviewed changes

arsenm reviewed Aug 28, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand #106153

[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand #106153

v01dXYZ commented Aug 26, 2024 •

edited

Loading

llvmbot commented Aug 26, 2024 •

edited

Loading

github-actions bot commented Aug 26, 2024 •

edited

Loading

phoebewang left a comment

Artem-B left a comment

Artem-B Aug 27, 2024

gonzalobg Aug 27, 2024

v01dXYZ Aug 27, 2024

Artem-B Aug 28, 2024

Artem-B Aug 27, 2024

v01dXYZ Aug 27, 2024

v01dXYZ commented Aug 27, 2024 •

edited

Loading

Artem-B Aug 27, 2024

v01dXYZ Aug 27, 2024

phoebewang Aug 28, 2024

v01dXYZ Aug 28, 2024

phoebewang Aug 28, 2024

v01dXYZ Aug 28, 2024 •

edited

Loading

phoebewang Aug 28, 2024

phoebewang Aug 28, 2024

phoebewang Aug 28, 2024

v01dXYZ Aug 28, 2024 •

edited

Loading

phoebewang Aug 28, 2024

v01dXYZ Aug 28, 2024

RKSimon left a comment

RKSimon left a comment

arsenm left a comment

v01dXYZ commented Aug 28, 2024 •

edited

Loading

arsenm commented Aug 28, 2024

v01dXYZ commented Aug 28, 2024 •

edited

Loading

[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand #106153

Are you sure you want to change the base?

[X86][NVPTX][LegalizeDAG] If i16 legal, legalize FABS/FNEG/FCOPYSIGN (f16) with Expand #106153

Conversation

v01dXYZ commented Aug 26, 2024 • edited Loading

llvmbot commented Aug 26, 2024 • edited Loading

github-actions bot commented Aug 26, 2024 • edited Loading

phoebewang left a comment

Choose a reason for hiding this comment

Artem-B left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v01dXYZ commented Aug 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v01dXYZ Aug 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v01dXYZ Aug 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RKSimon left a comment

Choose a reason for hiding this comment

RKSimon left a comment

Choose a reason for hiding this comment

arsenm left a comment

Choose a reason for hiding this comment

v01dXYZ commented Aug 28, 2024 • edited Loading

arsenm commented Aug 28, 2024

v01dXYZ commented Aug 28, 2024 • edited Loading

v01dXYZ commented Aug 26, 2024 •

edited

Loading

llvmbot commented Aug 26, 2024 •

edited

Loading

github-actions bot commented Aug 26, 2024 •

edited

Loading

v01dXYZ commented Aug 27, 2024 •

edited

Loading

v01dXYZ Aug 28, 2024 •

edited

Loading

v01dXYZ Aug 28, 2024 •

edited

Loading

v01dXYZ commented Aug 28, 2024 •

edited

Loading

v01dXYZ commented Aug 28, 2024 •

edited

Loading