[RISCV] Don't cost vector arithmetic fp ops as cheaper than scalar (llvm#99594)

I was comparing some SPEC CPU 2017 benchmarks across rva22u64 and
rva22u64_v, and noticed that in a few cases rva22u64_v was
considerably slower.

One of them was 519.lbm_r, which has a large loop that was being
unprofitably vectorized. It has an if/else in the loop which requires
large amounts of predication when vectorized, but despite the loop
vectorizer taking this into account the vector cost came out as cheaper
than the scalar.

The reason appears to be that we cost scalar floating point ops as 2,
but their vector equivalents as 1 (for LMUL 1). This comes from our use
of BasicTTIImpl for scalars, which treats floats as twice as expensive
as integers.

This patch doubles the cost of vector floating point arithmetic ops so
that they're at least as expensive as their scalar counterparts, which
gives a 13% speedup on 519.lbm_r at -O3 on the spacemit-x60.
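
As a rough illustration of the cost comparison described above, here is a
toy model in standalone C++. It is only a sketch and not LLVM code: the
names ToyScalarFPCost, ToyVectorCostLMUL1, toyVectorFPCost and
checkToyCosts are hypothetical, the two constants are simply the costs
quoted in this message, and NumLegalParts is a plain integer standing in
for LT.first (which is not a plain integer in the real code).

```cpp
#include <cassert>

// Toy cost model. Every name below is hypothetical; none of this is LLVM API.
constexpr unsigned ToyScalarFPCost = 2;    // scalar FP op cost quoted above
constexpr unsigned ToyVectorCostLMUL1 = 1; // vector op cost at LMUL 1 quoted above

// Cost of one vector FP arithmetic op in the toy model.
// NumLegalParts stands in for LT.first in the patch (assumed to be a small
// integer here); DoubleFP selects the behaviour this patch introduces.
constexpr unsigned toyVectorFPCost(unsigned NumLegalParts, bool DoubleFP) {
  unsigned InstrCost = ToyVectorCostLMUL1;
  if (DoubleFP)
    InstrCost *= 2; // mirror the scalar "FP is twice integer" assumption
  return NumLegalParts * InstrCost;
}

inline void checkToyCosts() {
  // Without doubling, a vector FP op looks cheaper than a scalar one.
  assert(toyVectorFPCost(1, /*DoubleFP=*/false) < ToyScalarFPCost);
  // With doubling, the vector op is at least as expensive as the scalar one.
  assert(toyVectorFPCost(1, /*DoubleFP=*/true) >= ToyScalarFPCost);
}
```

In this model the doubling only removes the artificial per-op discount for
vector FP ops; whether vectorizing a given loop is profitable still depends
on everything else the loop vectorizer adds up, such as the predication
overhead mentioned above.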

Fixes llvm#62576 (the last point there about scalar fsub/fmul)
lukel97 authored Jul 22, 2024
1 parent 9d2f81e commit 58854fa
Showing 6 changed files with 484 additions and 347 deletions.
11 changes: 8 additions & 3 deletions llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1688,7 +1688,6 @@ InstructionCost RISCVTTIImpl::getArithmeticInstrCost(
     return BaseT::getArithmeticInstrCost(Opcode, Ty, CostKind, Op1Info, Op2Info,
                                          Args, CxtI);
 
-
   auto getConstantMatCost =
       [&](unsigned Operand, TTI::OperandValueInfo OpInfo) -> InstructionCost {
     if (OpInfo.isUniform() && TLI->canSplatOperand(Opcode, Operand))
@@ -1760,8 +1759,14 @@ InstructionCost RISCVTTIImpl::getArithmeticInstrCost(
                                                            Op1Info, Op2Info,
                                                            Args, CxtI);
   }
-  return ConstantMatCost +
-         LT.first * getRISCVInstructionCost(Op, LT.second, CostKind);
+
+  InstructionCost InstrCost = getRISCVInstructionCost(Op, LT.second, CostKind);
+  // We use BasicTTIImpl to calculate scalar costs, which assumes floating point
+  // ops are twice as expensive as integer ops. Do the same for vectors so
+  // scalar floating point ops aren't cheaper than their vector equivalents.
+  if (Ty->isFPOrFPVectorTy())
+    InstrCost *= 2;
+  return ConstantMatCost + LT.first * InstrCost;
 }
 
 // TODO: Deduplicate from TargetTransformInfoImplCRTPBase.