Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Vector binary ops of splats aren't scalarized on fixed vectors #65068

Open
lukel97 opened this issue Aug 29, 2023 · 3 comments
Open

[RISCV] Vector binary ops of splats aren't scalarized on fixed vectors #65068

lukel97 opened this issue Aug 29, 2023 · 3 comments

Comments

@lukel97
Copy link
Contributor

lukel97 commented Aug 29, 2023

The functions below multiply a vector with a splatted scalar, computed from an add:

define <4 x i64> @f_v4i64(<4 x i64> %x, i64 %y) {
  %1 = insertelement <4 x i64> poison, i64 %y, i32 0
  %2 = shufflevector <4 x i64> %1, <4 x i64> poison, <4 x i32> zeroinitializer
  %3 = add <4 x i64> %2, <i64 3, i64 3, i64 3, i64 3>
  %4 = mul <4 x i64> %x, %3
  ret <4 x i64> %4
}

define <vscale x 4 x i64> @f_nxv4i64(<vscale x 4 x i64> %x, i64 %y) {
  %1 = insertelement <vscale x 4 x i64> poison, i64 %y, i32 0
  %2 = shufflevector <vscale x 4 x i64> %1, <vscale x 4 x i64> poison, <vscale x 4 x i32> zeroinitializer
  %3 = add <vscale x 4 x i64> %2, shufflevector(<vscale x 4 x i64> insertelement(<vscale x 4 x i64> poison, i64 3, i32 0), <vscale x 4 x i64> poison, <vscale x 4 x i32> zeroinitializer)
  %4 = mul <vscale x 4 x i64> %x, %3
  ret <vscale x 4 x i64> %4
}

When compiled with llc -o - -mattr=+v, the splatted add is scalarized for the scalable version:

f_nxv4i64:
	addi	a0, a0, 3
	vsetvli	a1, zero, e64, m4, ta, ma
	vmul.vx	v8, v8, a0
	ret

But the fixed version doesn't:

f_v4i64:
	vsetivli	zero, 4, e64, m2, ta, ma
	vmv.v.x	v10, a0
	vadd.vi	v10, v10, 3
	vmul.vv	v8, v8, v10
	ret
@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2023

@llvm/issue-subscribers-backend-risc-v

@ChunyuLiao
Copy link
Member

candidate patch : https://reviews.llvm.org/D159190

@EugeneZelenko EugeneZelenko removed llvm:codegen llvm:SelectionDAG SelectionDAGISel as well labels Aug 30, 2023
@ChunyuLiao
Copy link
Member

candidate patch : https://reviews.llvm.org/D159190
This patch is not a good solution.

Looking forward to a more generalized solution.

@ChunyuLiao ChunyuLiao added llvm:codegen llvm:SelectionDAG SelectionDAGISel as well labels Sep 8, 2023
lukel97 added a commit to lukel97/llvm-project that referenced this issue Sep 8, 2023
This adds tests for fixed and scalable vectors where we have a binary op on two
splats that could be scalarized. Normally this would be scalarized in the
middle-end by VectorCombine, but as noted in https://reviews.llvm.org/D159190,
this pattern can crop up during CodeGen afterwards.

Note that a combine already exists for this, but currently it only works on
scalable vectors where the element type == XLEN. See llvm#65068 and llvm#65072
lukel97 added a commit to lukel97/llvm-project that referenced this issue Sep 13, 2023
When scalarizing bin ops of splats, we treat the extract as free for a
splat_vector because its operand is already scalar, i.e.

(extract idx, (splat_vector x)) -> x.

The same also applies for a build_vector that's a splat:

(extract idx, (build_vector x x x x)) -> x.

This patch takes this into account, which enables scalarization for fixed
length vectors, since the current canonical form for a splatted fixed length
vector is still build_vector.

This improves what we were seeing on RISC-V in llvm#65068, but unfortunately causes
some patterns to be missed on other targets.  One big one is that on AArch64
and X86 scalarizing (xor (splat x), (splat -1)) to (splat (xor x, -1)) prevents
vnot from being matched, which for example prevents bif from being matched.

Posting this patch as a WIP to show my findings.
lukel97 added a commit that referenced this issue Sep 20, 2023
…65747)

This adds tests for fixed and scalable vectors where we have a binary op
on two splats that could be scalarized. Normally this would be
scalarized in the middle-end by VectorCombine, but as noted in
https://reviews.llvm.org/D159190, this pattern can crop up during
CodeGen afterwards.

Note that a combine already exists for this, but on RISC-V currently it
only works on scalable vectors where the element type == XLEN. See
#65068 and #65072
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants