JIT: Consume FMA intrinsic operands in right order #102914
Merged
It's worth noting that the reason we didn't do this in lowering (unlike most of the other cases that do swap operands) is that it would require introducing a lot of new synthetic intrinsics, which was believed to be more costly overall.
Each `FusedMultiplyAdd` intrinsic has three forms, where `op3` is always the node that can be optionally contained:

- `132` - `op1 = (op1 * op3) + op2`
- `213` - `op1 = (op2 * op1) + op3`
- `231` - `op1 = (op2 * op3) + op1`

The managed API we expose is the `213` form and there are 10 different FMA intrinsics, so we'd need to expose 10 more for the `132` and 10 more for the `231` form. Then we'd need to repeat this for the `Avx512` specific variants, giving us at least 50 new synthetic intrinsics in lowering just to cover `FMA`.
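As a rough illustration of the three forms listed above, here is a minimal C++ sketch that models each form as a plain function. The names `fma132`, `fma213`, `fma231`, and `allFormsAgree` are hypothetical and exist only for this example; they are not JIT functions, and `std::fma` is used only to model the arithmetic of each form.

```cpp
#include <cmath>

// Hypothetical helpers, named after the three encodings.
// Each returns the value the form would write back to op1.
double fma132(double op1, double op2, double op3) { return std::fma(op1, op3, op2); }
double fma213(double op1, double op2, double op3) { return std::fma(op2, op1, op3); }
double fma231(double op1, double op2, double op3) { return std::fma(op2, op3, op1); }

// Computes a * b + c three ways by permuting which value plays which operand.
// Returns true when all three forms produce the same result.
bool allFormsAgree(double a, double b, double c) {
    double r213 = fma213(a, b, c); // (b * a) + c
    double r132 = fma132(a, c, b); // (a * b) + c
    double r231 = fma231(c, a, b); // (a * b) + c
    return r213 == r132 && r132 == r231;
}
```

The point of the sketch is that the same computation can be expressed in any of the three forms as long as the operands are permuted to match, which is why the choice of form decides which value ends up in the `op3` slot that can be contained.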
We might be able to avoid new synthetic intrinsics if we had a way to track which permutation it was, but free bits are fairly scarce right now. So I think we'd need to get clever in how we tracked that.
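To make the idea of tracking the permutation concrete, here is a purely hypothetical C++ sketch that packs the form into two bits of a flags word. Nothing here reflects the JIT's actual node layout: `FmaForm`, `kFormShift`, `setForm`, and `getForm` are invented for illustration, and the bit position is an assumption.

```cpp
#include <cstdint>

// Hypothetical encoding of the three FMA forms; two bits are enough.
enum class FmaForm : std::uint8_t { Form213 = 0, Form132 = 1, Form231 = 2 };

// Assumed position of two free bits in a 32-bit flags word.
constexpr std::uint32_t kFormShift = 30;
constexpr std::uint32_t kFormMask  = 0x3u << kFormShift;

// Stores the form in the two assumed free bits, leaving other bits untouched.
constexpr std::uint32_t setForm(std::uint32_t flags, FmaForm form) {
    return (flags & ~kFormMask) | (static_cast<std::uint32_t>(form) << kFormShift);
}

// Reads the form back out of the flags word.
constexpr FmaForm getForm(std::uint32_t flags) {
    return static_cast<FmaForm>((flags & kFormMask) >> kFormShift);
}

static_assert(getForm(setForm(0xFFFFFFFFu, FmaForm::Form231)) == FmaForm::Form231,
              "form survives a round trip");
```

The cost is the one the comment points out: this only works if two bits are actually free on the node.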
We cannot use `genConsumeMultiOpOperands()` here instead?

Edit: I assume because we won't be using the same order as swapped operands?
Right, `op1, op2, op3` (the order that `genConsumeMultiOpOperands` consumes in) is not the same as `emitOp1, emitOp2, emitOp3`, which as I understand it is the order that uses were built in by LSRA. We should consume in that order.

I don't have the context necessary to completely understand why we can't build and consume the operands in the `op1, op2, op3` order, even if the instruction we end up emitting uses the registers in a different order.
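The ordering concern can be sketched with a toy model. This is not JIT code: `Operand`, `UseList`, and `consumeAll` are hypothetical names, and the model only assumes what the comment states, namely that operands should be consumed in the same order in which their uses were built.

```cpp
#include <array>

// Toy model only; none of these types exist in the JIT.
enum class Operand { Op1, Op2, Op3 };
using Order = std::array<Operand, 3>;

// Hypothetical stand-in for the sequence of uses built by the allocator.
struct UseList {
    Order built;
    int   next;

    explicit UseList(Order b) : built(b), next(0) {}

    // Returns true if the consumed operand is the next one that was built.
    bool consume(Operand op) { return next < 3 && built[next++] == op; }
};

// Consumes all three operands in the given order.
// Returns true only if every consume matched the build order.
bool consumeAll(UseList uses, const Order& order) {
    bool ok = true;
    for (Operand op : order) {
        ok = uses.consume(op) && ok;
    }
    return ok;
}
```

In this model, if the uses were built in a swapped order such as `Op2, Op1, Op3`, consuming in plain `Op1, Op2, Op3` order reports a mismatch, while consuming in the built order succeeds.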
@tannergooding - do you know?
That I don't. It's an area of the register allocator I'm not well versed in.
Hmm, I think it has to do with how we are consuming them in codegen as opposed to the LSRA ordering.