[DAG] ExpandOp_NormalStore - check for bitcasted value that has legal store instead of splitting #171478
… store instead of splitting
DAGCombine does attempt this, but we can end up in situations where the bitcast appears after the store has been folded, and we don't try again. Noticed while working on some i256/i512 codegen patches that get cast back from 256/512-bit vectors.
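To make the idea concrete, here is a minimal standalone sketch of the decision the patch adds. It is a toy model, not the LLVM API: `Value`, `Target`, `isStoreLegal`, and `lowerStore` are hypothetical names invented for illustration, and types are plain strings. It only shows the shape of the check: if the stored value is a bitcast and the target can store the bitcast's source type directly, emit one store of the source value instead of splitting into two halves.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of a DAG value; not LLVM's SDValue.
enum class Opcode { Bitcast, Other };

struct Value {
  Opcode Op;            // what kind of node produced this value
  std::string Type;     // type of this value, e.g. "i64"
  const Value *Operand; // source value of a bitcast, otherwise nullptr
};

// Hypothetical stand-in for the target's legality query.
struct Target {
  std::set<std::string> LegalStoreTypes;
  bool isStoreLegal(const std::string &Ty) const {
    return LegalStoreTypes.count(Ty) != 0;
  }
};

// Returns the types of the stores that would be emitted for V.
// HalfType is the type each half would have if the store were split.
std::vector<std::string> lowerStore(const Value &V, const Target &TLI,
                                    const std::string &HalfType) {
  // Storing a bitcasted value: store the original value if that is legal.
  if (V.Op == Opcode::Bitcast && V.Operand != nullptr &&
      TLI.isStoreLegal(V.Operand->Type))
    return {V.Operand->Type};
  // Otherwise fall back to splitting into low and high halves.
  return {HalfType, HalfType};
}
```

In this sketch, a target that lists `f64` and `i32` as legal store types (loosely modelled on the X86-NOSSE runs in `atomic-fp.ll`) lowers a store of an `i64` bitcast from `f64` to a single `f64` store, while a plain `i64` value is still split into two `i32` stores.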
@llvm/pr-subscribers-backend-x86
Author: Simon Pilgrim (RKSimon)
Changes
DAGCombine does attempt this, but we can end up in situations where the bitcast appears after the store has been folded, and we don't try again - someday we'll have better topological sorting :(
Noticed while working on some i256/i512 codegen patches that get cast back from 256/512-bit vectors.
Full diff: https://github.com/llvm/llvm-project/pull/171478.diff
3 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
index 88c1af20a321e..4348f0d6f0aa7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
@@ -479,15 +479,23 @@ SDValue DAGTypeLegalizer::ExpandOp_NormalStore(SDNode *N, unsigned OpNo) {
   StoreSDNode *St = cast<StoreSDNode>(N);
   assert(!St->isAtomic() && "Atomics can not be split");
-  EVT ValueVT = St->getValue().getValueType();
-  EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), ValueVT);
   SDValue Chain = St->getChain();
+  SDValue Value = St->getValue();
   SDValue Ptr = St->getBasePtr();
+  EVT ValueVT = Value.getValueType();
+  EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), ValueVT);
   AAMDNodes AAInfo = St->getAAInfo();
   assert(NVT.isByteSized() && "Expanded type not byte sized!");
   unsigned IncrementSize = NVT.getSizeInBits() / 8;
+  // Storing a bitcasted value, see if the original type is a legal store.
+  // TODO: Not necessary if we had proper topological sorting of nodes.
+  if (Value.getOpcode() == ISD::BITCAST &&
+      TLI.isOperationLegal(ISD::STORE, Value.getOperand(0).getValueType()))
+    return DAG.getStore(Chain, dl, Value.getOperand(0), Ptr,
+                        St->getMemOperand());
+
   SDValue Lo, Hi;
   GetExpandedOp(St->getValue(), Lo, Hi);
diff --git a/llvm/test/CodeGen/X86/atomic-fp.ll b/llvm/test/CodeGen/X86/atomic-fp.ll
index fe79dfe39f645..be67c19dfe111 100644
--- a/llvm/test/CodeGen/X86/atomic-fp.ll
+++ b/llvm/test/CodeGen/X86/atomic-fp.ll
@@ -87,15 +87,11 @@ define dso_local void @fadd_64r(ptr %loc, double %val) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %ecx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: faddl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl %ecx, (%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%eax)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -252,15 +248,11 @@ define dso_local void @fadd_64g() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: faddl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: faddl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll glob64
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -415,15 +407,11 @@ define dso_local void @fadd_64imm() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: faddl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: faddl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll -559038737
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -583,15 +571,11 @@ define dso_local void @fadd_64stack() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: faddl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: faddl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -685,15 +669,11 @@ define dso_local void @fadd_array(ptr %arg, double %arg1, i64 %arg2) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %edx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: faddl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NOSSE-NEXT: movl %edx, (%esp)
-; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%ecx,%eax,8)
; X86-NOSSE-NEXT: leal -4(%ebp), %esp
; X86-NOSSE-NEXT: popl %esi
@@ -859,15 +839,11 @@ define dso_local void @fsub_64r(ptr %loc, double %val) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %ecx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fsubl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl %ecx, (%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%eax)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1024,16 +1000,12 @@ define dso_local void @fsub_64g() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
; X86-NOSSE-NEXT: fchs
-; X86-NOSSE-NEXT: faddl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: faddl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll glob64
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1190,16 +1162,12 @@ define dso_local void @fsub_64imm() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
; X86-NOSSE-NEXT: fchs
-; X86-NOSSE-NEXT: faddl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: faddl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll -559038737
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1360,15 +1328,11 @@ define dso_local void @fsub_64stack() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: fsubl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: fsubl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1464,15 +1428,11 @@ define dso_local void @fsub_array(ptr %arg, double %arg1, i64 %arg2) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %edx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fsubl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NOSSE-NEXT: movl %edx, (%esp)
-; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%ecx,%eax,8)
; X86-NOSSE-NEXT: leal -4(%ebp), %esp
; X86-NOSSE-NEXT: popl %esi
@@ -1638,15 +1598,11 @@ define dso_local void @fmul_64r(ptr %loc, double %val) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %ecx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fmull 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl %ecx, (%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%eax)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1800,15 +1756,11 @@ define dso_local void @fmul_64g() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll glob64
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -1963,15 +1915,11 @@ define dso_local void @fmul_64imm() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll -559038737
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2131,15 +2079,11 @@ define dso_local void @fmul_64stack() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2233,15 +2177,11 @@ define dso_local void @fmul_array(ptr %arg, double %arg1, i64 %arg2) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %edx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fmull 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NOSSE-NEXT: movl %edx, (%esp)
-; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%ecx,%eax,8)
; X86-NOSSE-NEXT: leal -4(%ebp), %esp
; X86-NOSSE-NEXT: popl %esi
@@ -2407,15 +2347,11 @@ define dso_local void @fdiv_64r(ptr %loc, double %val) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %ecx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl %ecx, (%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%eax)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2571,15 +2507,11 @@ define dso_local void @fdiv_64g() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivs {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll glob64
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2734,15 +2666,11 @@ define dso_local void @fdiv_64imm() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivs {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll -559038737
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2902,15 +2830,11 @@ define dso_local void @fdiv_64stack() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: fdivl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: fdivl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -3006,15 +2930,11 @@ define dso_local void @fdiv_array(ptr %arg, double %arg1, i64 %arg2) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %edx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NOSSE-NEXT: movl %edx, (%esp)
-; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%ecx,%eax,8)
; X86-NOSSE-NEXT: leal -4(%ebp), %esp
; X86-NOSSE-NEXT: popl %esi
diff --git a/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll b/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
index f65461ccee23b..4e7a694f31c9c 100644
--- a/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
+++ b/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
@@ -87,8 +87,8 @@ define void @load_single_256bit_elt_vector(ptr %in, ptr %off, ptr %out) nounwind
; SSE-NEXT: xorps %xmm2, %xmm2
; SSE-NEXT: movaps %xmm2, 48(%rdx)
; SSE-NEXT: movaps %xmm2, 32(%rdx)
-; SSE-NEXT: movaps %xmm0, (%rdx)
; SSE-NEXT: movaps %xmm1, 16(%rdx)
+; SSE-NEXT: movaps %xmm0, (%rdx)
; SSE-NEXT: retq
;
; AVX-LABEL: load_single_256bit_elt_vector:
+; X86-NOSSE-NEXT: movl %ecx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl %ecx, (%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%eax)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2571,15 +2507,11 @@ define dso_local void @fdiv_64g() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivs {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll glob64
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2734,15 +2666,11 @@ define dso_local void @fdiv_64imm() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivs {{\.?LCPI[0-9]+_[0-9]+}}
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll -559038737
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -2902,15 +2830,11 @@ define dso_local void @fdiv_64stack() nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %eax, {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %eax, (%esp)
; X86-NOSSE-NEXT: fld1
-; X86-NOSSE-NEXT: fdivl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: fdivl (%esp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %eax
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %ecx
-; X86-NOSSE-NEXT: movl %eax, (%esp)
-; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ebp, %esp
; X86-NOSSE-NEXT: popl %ebp
@@ -3006,15 +2930,11 @@ define dso_local void @fdiv_array(ptr %arg, double %arg1, i64 %arg2) nounwind {
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
+; X86-NOSSE-NEXT: movl %edx, (%esp)
+; X86-NOSSE-NEXT: fldl (%esp)
; X86-NOSSE-NEXT: fdivl 12(%ebp)
; X86-NOSSE-NEXT: fstpl {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %edx
-; X86-NOSSE-NEXT: movl {{[0-9]+}}(%esp), %esi
-; X86-NOSSE-NEXT: movl %edx, (%esp)
-; X86-NOSSE-NEXT: movl %esi, {{[0-9]+}}(%esp)
-; X86-NOSSE-NEXT: fildll (%esp)
+; X86-NOSSE-NEXT: fildll {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fistpll (%ecx,%eax,8)
; X86-NOSSE-NEXT: leal -4(%ebp), %esp
; X86-NOSSE-NEXT: popl %esi
diff --git a/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll b/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
index f65461ccee23b..4e7a694f31c9c 100644
--- a/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
+++ b/llvm/test/CodeGen/X86/single_elt_vector_memory_operation.ll
@@ -87,8 +87,8 @@ define void @load_single_256bit_elt_vector(ptr %in, ptr %off, ptr %out) nounwind
; SSE-NEXT: xorps %xmm2, %xmm2
; SSE-NEXT: movaps %xmm2, 48(%rdx)
; SSE-NEXT: movaps %xmm2, 32(%rdx)
-; SSE-NEXT: movaps %xmm0, (%rdx)
; SSE-NEXT: movaps %xmm1, 16(%rdx)
+; SSE-NEXT: movaps %xmm0, (%rdx)
; SSE-NEXT: retq
;
; AVX-LABEL: load_single_256bit_elt_vector:
  // Storing a bitcasted value, see if the original type is a legal store.
  // TODO: Not necessary if we had proper topological sorting of nodes.
  if (Value.getOpcode() == ISD::BITCAST &&
      TLI.isOperationLegal(ISD::STORE, Value.getOperand(0).getValueType()))
The state of memory instruction legality queries is bad. Does this need to worry about alignment, address space, and everything else?
We can add an isStoreBitCastBeneficial check as well, similar to what DAGCombiner::visitSTORE does, which should also do an allowsMemoryAccess check?
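To make the shape of the proposed guard concrete, here is a minimal standalone sketch of the decision being discussed: store the pre-bitcast value directly only when its type has a legal store and the access itself (alignment, address space) is acceptable; otherwise fall back to splitting. Everything in it (`ToyVT`, `ToyTarget`, `hasLegalStore`, `allowsAccess`, `chooseStoreLowering`) is a hypothetical stand-in invented for illustration. It is not LLVM's API and does not reproduce the signatures of `isOperationLegal`, `isStoreBitCastBeneficial`, or `allowsMemoryAccess`.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for illustration only; none of these are LLVM types.
enum class ToyVT { i32, i64, f64 };

struct ToyValue {
  ToyVT Type;     // type of the value being stored
  bool IsBitcast; // true if the value is a bitcast of another value
  ToyVT SrcType;  // type before the bitcast (only meaningful if IsBitcast)
};

struct ToyStore {
  ToyValue Value;
  unsigned AlignBytes; // alignment of the memory access
  unsigned AddrSpace;  // address space of the pointer
};

struct ToyTarget {
  std::set<ToyVT> LegalStoreTypes; // types the toy target can store directly
  unsigned MinAlignBytes;          // smallest alignment the toy target accepts

  bool hasLegalStore(ToyVT VT) const { return LegalStoreTypes.count(VT) != 0; }

  // Assumption: the toy target only accepts address space 0 and
  // sufficiently aligned accesses, regardless of the value type.
  bool allowsAccess(ToyVT, unsigned AlignBytes, unsigned AddrSpace) const {
    return AddrSpace == 0 && AlignBytes >= MinAlignBytes;
  }
};

enum class Lowering { StoreSourceDirectly, SplitIntoHalves };

// Prefer storing the pre-bitcast value when that is both legal and allowed
// for this particular access; otherwise split the store as before.
Lowering chooseStoreLowering(const ToyTarget &T, const ToyStore &St) {
  assert(St.AlignBytes != 0 && "alignment must be non-zero");
  const ToyValue &V = St.Value;
  if (V.IsBitcast && T.hasLegalStore(V.SrcType) &&
      T.allowsAccess(V.SrcType, St.AlignBytes, St.AddrSpace))
    return Lowering::StoreSourceDirectly;
  return Lowering::SplitIntoHalves;
}
```

In this toy model, an i64 store of a bitcasted f64 takes the direct path only when the access also passes the alignment and address-space check; an under-aligned access, a non-default address space, or a value that is not a bitcast all fall back to the split path.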
; X86-NOSSE-NEXT: movl %edx, {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ecx, {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: fldl {{[0-9]+}}(%esp)
; X86-NOSSE-NEXT: movl %ecx, (%esp)
This looks like it should be handled in the custom lowering for i64 AtomicStore. Or we need to support f64 AtomicStore without a bitcast. We're spilling to the stack, but we don't need to.
Cheers, I'll take another look
…iginally legal f64 values that we can store directly. Based off feedback from llvm#171478