-
Notifications
You must be signed in to change notification settings - Fork 12.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV] Add tests where bin ops of splats could be scalarized. NFC #65747
[RISCV] Add tests where bin ops of splats could be scalarized. NFC #65747
Conversation
This adds tests for fixed and scalable vectors where we have a binary op on two splats that could be scalarized. Normally this would be scalarized in the middle-end by VectorCombine, but as noted in https://reviews.llvm.org/D159190, this pattern can crop up during CodeGen afterwards. Note that a combine already exists for this, but currently it only works on scalable vectors where the element type == XLEN. See llvm#65068 and llvm#65072
; RV64-NEXT: vsetivli zero, 8, e64, m4, ta, ma | ||
; RV64-NEXT: vmv.v.x v8, a0 | ||
; RV64-NEXT: vadd.vx v12, v8, a1 | ||
; RV64-NEXT: vrgather.vi v8, v12, 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's going on here: something's recognising that it's a splat, but it's splatting the result after the vadd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading this right, its trying to move the result of the add v12
into v8
. It's doing this move using vrgather. I think this could have done vadd.vx v8, v8, a1
and removed the need for the vrgather?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it should just be vadd.vx v8, v8, a1
, but the weird thing is that the vrgather.vi
is actually splatting the first element of v12
into v8
. Which is weird because the the input IR doesn't say anything about splatting %v
.
So it must somehow know that the result of vadd.vx v12, v8, a1
is a splatted vector, and then it splats the first element again anyway which is a no-op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it must somehow know that the result of vadd.vx v12, v8, a1 is a splatted vector, and then it splats the first element again anyway which is a no-op?
It seems that way.
Will you try and scalarize instructions like div and rem which are more expensive and may need cost modeling? Will you scalarize f16 vectors in the case that there is no f16 scalar, which requires extra instructions to convert to f32 first? If so, it could be nice to have tests for these instructions/types. |
The combine already exists in DAGCombiner, but at the SelectionDAG level it's pretty simple and doesn't do any cost modelling. From a quick scan it does look like it tries to scalarize UDIV/SDIV. It also looks like it doesn't try to scalarize it if there isn't a matching legal scalar operation for the type, e.g. here's the "cost model": bool IsBothSplatVector = N0.getOpcode() == ISD::SPLAT_VECTOR &&
N1.getOpcode() == ISD::SPLAT_VECTOR;
if (!Src0 || !Src1 || Index0 != Index1 ||
Src0.getValueType().getVectorElementType() != EltVT ||
Src1.getValueType().getVectorElementType() != EltVT ||
!(IsBothSplatVector || TLI.isExtractVecEltCheap(VT, Index0)) ||
!TLI.isOperationLegalOrCustom(Opcode, EltVT))
return SDValue(); I agree though, will add those tests to confirm that it doesn't attempt to scalarize them. |
I want to echo @topperc’s concern that if VL is zero for a div then we should not scalarize since that would lead to division by zero in scalar land, but in vector land no division occurs. Can we explicitly test this case? EDIT: Sorry, we may not need to do this because I think VL is never zero when using vectors and not vp intrinsics ? |
Yeah, with regular vector arithmetic there's no concept of the EVL, I double checked anyway and vp nodes don't get run through this combine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This adds tests for fixed and scalable vectors where we have a binary op on two splats that could be scalarized. Normally this would be scalarized in the middle-end by VectorCombine, but as noted in https://reviews.llvm.org/D159190, this pattern can crop up during CodeGen afterwards.
Note that a combine already exists for this, but on RISC-V currently it only works on scalable vectors where the element type == XLEN. See #65068 and #65072