Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86 wrong code for floating point comparison #63055

Closed
cbeuw opened this issue Jun 1, 2023 · 3 comments
Closed

x86 wrong code for floating point comparison #63055

cbeuw opened this issue Jun 1, 2023 · 3 comments

Comments

@cbeuw
Copy link

cbeuw commented Jun 1, 2023

From rust-lang/rust#112170

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" 
target triple = "x86_64-unknown-linux-gnu" 
 
define void @fn1() { 
start: 
  %_6 = fadd double 0.000000e+00, 0.000000e+00 
  br label %bb2.outer1464 
 
bb2.outer1464:                                    ; preds = %bb2.outer1464, %start 
  %0 = icmp eq i32 0, 0 
  br i1 %0, label %bb15.preheader, label %bb2.outer1464 
 
bb15.preheader:                                   ; preds = %bb2.outer1464 
  %_23.le = fcmp une double 0x7FF8000000000000, %_6 
  %_30 = zext i1 %_23.le to i64 
  tail call void @print_var(i64 %_30) 
  ret void 
} 
 
declare void @print_var(i64) 

%_30 should be 1, as NaN != 0 is true, however on x86 print_var gets called with 255:

fn1:                                    # @fn1
        xorps   xmm0, xmm0
        addsd   xmm0, xmm0
        mov     edi, 255
        jmp     print_var@PLT                   # TAILCALL

https://godbolt.org/z/5f447cnfn

opt -O3 is correct:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @fn1() local_unnamed_addr {
  tail call void @print_var(i64 1)
  ret void
}

declare void @print_var(i64) local_unnamed_addr
@nikic
Copy link
Contributor

nikic commented Jun 1, 2023

Reduced:

define i64 @test(double %arg) { 
  %fcmp = fcmp une double 0x7FF8000000000000, %arg
  %ext = zext i1 %fcmp to i64 
  ret i64 %ext
}

After type legalization:

Type-legalized selection DAG: %bb.0 'test:'
SelectionDAG has 13 nodes:
  t0: ch,glue = EntryToken
          t2: f64,ch = CopyFromReg t0, Register:f64 %0
        t13: i8 = setcc ConstantFP:f64<nan>, t2, setune:ch
      t14: i64 = any_extend t13
    t16: i64 = and t14, Constant:i64<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t16
  t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1

After DAGCombining:

Optimized type-legalized selection DAG: %bb.0 'test:'
SelectionDAG has 11 nodes:
  t0: ch,glue = EntryToken
        t2: f64,ch = CopyFromReg t0, Register:f64 %0
      t13: i8 = setcc ConstantFP:f64<nan>, t2, setune:ch
    t17: i64 = zero_extend t13
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t17
  t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1

After legalization:

Legalized selection DAG: %bb.0 'test:'
SelectionDAG has 7 nodes:
    t0: ch,glue = EntryToken
    t17: i64 = zero_extend Constant:i8<-1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t17
  t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1

Either the combine or the legalization are wrong, depending on BooleanContents.

@nikic
Copy link
Contributor

nikic commented Jun 1, 2023

The problem is

. This getUNDEF() does not produce legal boolean contents.

@nikic
Copy link
Contributor

nikic commented Jun 1, 2023

Candidate patch: https://reviews.llvm.org/D151883

@nikic nikic closed this as completed in e506bfa Jun 1, 2023
nikic added a commit to rust-lang/llvm-project that referenced this issue Jun 5, 2023
FoldSetCC() returns UNDEF in a number of cases. However, the SetCC
result must follow BooleanContents. Unless the type is a
pre-legalization i1 or we have UndefinedBooleanContents, the use of
UNDEF will not uphold the requirement that the top bits are either
zero or match the low bit. In such cases, return zero instead.

Fixes llvm#63055.

Differential Revision: https://reviews.llvm.org/D151883

(cherry picked from commit e506bfa)
veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue Aug 27, 2024
FoldSetCC() returns UNDEF in a number of cases. However, the SetCC
result must follow BooleanContents. Unless the type is a
pre-legalization i1 or we have UndefinedBooleanContents, the use of
UNDEF will not uphold the requirement that the top bits are either
zero or match the low bit. In such cases, return zero instead.

Fixes llvm/llvm-project#63055.

Differential Revision: https://reviews.llvm.org/D151883
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants