Skip to content

Commit 4d204a4

Browse files
committed
[SelectionDAG] Tidy up around endianness and isConstantSplat
The BuildVectorSDNode::isConstantSplat function could depend on endianness, and it takes a bool argument that can be used to indicate if big or little endian should be considered when internally casting from a vector to a scalar. However, that argument is default set to false (= little endian). And in many situations, even in target generic code such as DAGCombiner, the endianness isn't specified when using the function. The intent with this patch is to highlight that endianness doesn't matter, depending on the context in which the function is used. In DAGCombiner the code is slightly refactored. Back in the days when the code was written it wasn't possible to request a MinSplatBits size when calling isConstantSplat. Instead the code re-expanded the found SplatValue to match with the EltBitWidth. Now we can just provide EltBitWidth as MinSplatBits and remove the logic for doing the re-expand. While being at it, tidying up around isConstantSplat, this patch also adds an explicit check in BuildVectorSDNode::isConstantSplat to break out from the loop if trying to split an on VecWidth into two halves. Haven't been able to prove that there could be miscompiles involved if not doing so. There are lit tests that trigger that scenario, although I think they happen to later discard the returned SplatValue for other reasons.
1 parent 499d41c commit 4d204a4

File tree

4 files changed

+40
-26
lines changed

4 files changed

+40
-26
lines changed

llvm/docs/LangRef.rst

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3888,7 +3888,7 @@ integer to memory.
38883888

38893889
A bitcast from a vector type to a scalar integer type will see the elements
38903890
being packed together (without padding). The order in which elements are
3891-
inserted in the integer depends on endianess. For little endian element zero
3891+
inserted in the integer depends on endianness. For little endian element zero
38923892
is put in the least significant bits of the integer, and for big endian
38933893
element zero is put in the most significant bits.
38943894

@@ -11677,7 +11677,7 @@ To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
1167711677
or :ref:`ptrtoint <i_ptrtoint>` instructions first.
1167811678

1167911679
There is a caveat for bitcasts involving vector types in relation to
11680-
endianess. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
11680+
endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
1168111681
of the vector in the least significant bits of the i16 for little-endian while
1168211682
element zero ends up in the most significant bits for big-endian.
1168311683

@@ -11686,9 +11686,9 @@ Example:
1168611686

1168711687
.. code-block:: text
1168811688

11689-
%X = bitcast i8 255 to i8 ; yields i8 :-1
11690-
%Y = bitcast i32* %x to i16* ; yields i16*:%x
11691-
%Z = bitcast <2 x i32> %V to i64; ; yields i64: %V (depends on endianess)
11689+
%X = bitcast i8 255 to i8 ; yields i8 :-1
11690+
%Y = bitcast i32* %x to i16* ; yields i16*:%x
11691+
%Z = bitcast <2 x i32> %V to i64; ; yields i64: %V (depends on endianness)
1169211692
%Z = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
1169311693

1169411694
.. _i_addrspacecast:

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Lines changed: 17 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -7076,12 +7076,23 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
70767076
N1, /*AllowUndef=*/false, /*AllowTruncation=*/true)) {
70777077
Constant = C->getAPIntValue();
70787078
} else if (BuildVectorSDNode *Vector = dyn_cast<BuildVectorSDNode>(N1)) {
7079+
unsigned EltBitWidth = Vector->getValueType(0).getScalarSizeInBits();
70797080
APInt SplatValue, SplatUndef;
70807081
unsigned SplatBitSize;
70817082
bool HasAnyUndefs;
7082-
bool IsSplat = Vector->isConstantSplat(SplatValue, SplatUndef,
7083-
SplatBitSize, HasAnyUndefs);
7084-
if (IsSplat) {
7083+
// Endianness should not matter here. Code below makes sure that we only
7084+
// use the result if the SplatBitSize is a multiple of the vector element
7085+
// size. And after that we AND all element sized parts of the splat
7086+
// together. So the end result should be the same regardless of in which
7087+
// order we do those operations.
7088+
const bool IsBigEndian = false;
7089+
bool IsSplat =
7090+
Vector->isConstantSplat(SplatValue, SplatUndef, SplatBitSize,
7091+
HasAnyUndefs, EltBitWidth, IsBigEndian);
7092+
7093+
// Make sure that variable 'Constant' is only set if 'SplatBitSize' is a
7094+
// multiple of 'BitWidth'. Otherwise, we could propagate a wrong value.
7095+
if (IsSplat && (SplatBitSize % EltBitWidth) == 0) {
70857096
// Undef bits can contribute to a possible optimisation if set, so
70867097
// set them.
70877098
SplatValue |= SplatUndef;
@@ -7090,23 +7101,9 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
70907101
// the first vector value and FF for the rest, repeating. We need a mask
70917102
// that will apply equally to all members of the vector, so AND all the
70927103
// lanes of the constant together.
7093-
unsigned EltBitWidth = Vector->getValueType(0).getScalarSizeInBits();
7094-
7095-
// If the splat value has been compressed to a bitlength lower
7096-
// than the size of the vector lane, we need to re-expand it to
7097-
// the lane size.
7098-
if (EltBitWidth > SplatBitSize)
7099-
for (SplatValue = SplatValue.zextOrTrunc(EltBitWidth);
7100-
SplatBitSize < EltBitWidth; SplatBitSize = SplatBitSize * 2)
7101-
SplatValue |= SplatValue.shl(SplatBitSize);
7102-
7103-
// Make sure that variable 'Constant' is only set if 'SplatBitSize' is a
7104-
// multiple of 'BitWidth'. Otherwise, we could propagate a wrong value.
7105-
if ((SplatBitSize % EltBitWidth) == 0) {
7106-
Constant = APInt::getAllOnes(EltBitWidth);
7107-
for (unsigned i = 0, n = (SplatBitSize / EltBitWidth); i < n; ++i)
7108-
Constant &= SplatValue.extractBits(EltBitWidth, i * EltBitWidth);
7109-
}
7104+
Constant = APInt::getAllOnes(EltBitWidth);
7105+
for (unsigned i = 0, n = (SplatBitSize / EltBitWidth); i < n; ++i)
7106+
Constant &= SplatValue.extractBits(EltBitWidth, i * EltBitWidth);
71107107
}
71117108
}
71127109

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -161,8 +161,13 @@ bool ISD::isConstantSplatVector(const SDNode *N, APInt &SplatVal) {
161161
unsigned SplatBitSize;
162162
bool HasUndefs;
163163
unsigned EltSize = N->getValueType(0).getVectorElementType().getSizeInBits();
164+
// Endianness does not matter here. We are checking for a splat given the
165+
// element size of the vector, and if we find such a splat for little endian
166+
// layout, then that should be valid also for big endian (as the full vector
167+
// size is known to be a multiple of the element size).
168+
const bool IsBigEndian = false;
164169
return BV->isConstantSplat(SplatVal, SplatUndef, SplatBitSize, HasUndefs,
165-
EltSize) &&
170+
EltSize, IsBigEndian) &&
166171
EltSize == SplatBitSize;
167172
}
168173

@@ -12357,6 +12362,10 @@ bool BuildVectorSDNode::isConstantSplat(APInt &SplatValue, APInt &SplatUndef,
1235712362

1235812363
// FIXME: This does not work for vectors with elements less than 8 bits.
1235912364
while (VecWidth > 8) {
12365+
// If we can't split in half, stop here.
12366+
if (VecWidth & 1)
12367+
break;
12368+
1236012369
unsigned HalfSize = VecWidth / 2;
1236112370
APInt HighValue = SplatValue.extractBits(HalfSize, HalfSize);
1236212371
APInt LowValue = SplatValue.extractBits(HalfSize, 0);
@@ -12374,6 +12383,12 @@ bool BuildVectorSDNode::isConstantSplat(APInt &SplatValue, APInt &SplatUndef,
1237412383
VecWidth = HalfSize;
1237512384
}
1237612385

12386+
// FIXME: The loop above only tries to split in halves. But if the input
12387+
// vector for example is <3 x i16> it wouldn't be able to detect a
12388+
// SplatBitSize of 16. No idea if that is a design flaw currently limiting
12389+
// optimizations. I guess that back in the days when this helper was created
12390+
// vectors normally was power-of-2 sized.
12391+
1237712392
SplatBitSize = VecWidth;
1237812393
return true;
1237912394
}

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2576,6 +2576,8 @@ performVectorTruncZeroCombine(SDNode *N, TargetLowering::DAGCombinerInfo &DCI) {
25762576
APInt SplatValue, SplatUndef;
25772577
unsigned SplatBitSize;
25782578
bool HasAnyUndefs;
2579+
// Endianness doesn't matter in this context because we are looking for
2580+
// an all-zero value.
25792581
return Splat &&
25802582
Splat->isConstantSplat(SplatValue, SplatUndef, SplatBitSize,
25812583
HasAnyUndefs) &&

0 commit comments

Comments
 (0)