-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[DAG] Constant Folding for U/SMUL_LOHI #69437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I did this after looking at #69324, where we emitted a couple of muls that were obviously foldable. Again no sure how effective this is, i just thought it's a nice to have. We have one more instruction in the testcase but it's because we scalarized a bit more and needed to copy to a VGPR, so overall I think it improved it. |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
| ; VI-NEXT: v_addc_u32_e32 v7, vcc, v5, v4, vcc | ||
| ; VI-NEXT: v_mul_lo_u32 v4, v7, s6 | ||
| ; VI-NEXT: s_mov_b32 s4, 0x346d900 | ||
| ; VI-NEXT: s_add_u32 s4, 0x4237, s4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a plan to fold this add?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, it looks like this:
t420: v2i32 = BUILD_VECTOR Constant:i32<16951>, Constant:i32<0>
t421: i64 = bitcast t420
t414: v2i32 = BUILD_VECTOR Constant:i32<54974720>, Constant:i32<0>
t415: i64 = bitcast t414
t190: i64 = add t421, t415
We can't do a generic combine for the bitcast I think. I tried it and it caused loops. I think the easiest would be to make a DAG equivalent of the GISel helper that looks through splats and just returns the value directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to replace dyn_cast<ConstantSDNode> checks in SelectionDAG::FoldConstantArithmetic with ComputeKnownBits calls that looks for isConstant scalar values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what the issue is here. Why can't we fold (bitcast (build_vector constant:i32, constant:i32)) -> constant:i64 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a ConstantFoldBITCASTofBUILD_VECTOR fold but it only runs before LegalizeTypes/Ops. Some targets need the bitcasts apparently. I tried doing it more often but only in specific cases (<=64 bit ints) but I still get infinite loops everywhere.
I think we need to add some helper to get splat values instead of checking ConstantSDNode when constant folding add. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it just be a matter of legality checks to avoid the loops?
No description provided.