Arm64: Implement VectorTableLookup/VectorTableLookupExtension intrinsic + Consecutive registers support #80297
Some more cleanup
Note: This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Here are a few examples of C# and the corresponding generated code: https://gist.github.com/kunalspathak/73922d385ea2642192c6167fa4a82778
…fewer register spilling
Done.
/azp runtime-coreclr jitstressregs
Command 'runtime-coreclr' is not supported by Azure Pipelines. Supported commands
See additional documentation.
/azp run runtime-coreclr jitstressregs
Azure Pipelines successfully started running 1 pipeline(s).
This should fix/improve mono support:
Thanks a lot @vargaz for the quick fix. Appreciate it.
Looks like all the
It seems there is an extra
Will investigate more.
This was because we were overwriting the
@BruceForstall - this should be ready to review again.
Interesting range of TP diffs, including improvements for non-arm64 code.
LGTM
failures are known issues. Thanks @BruceForstall for the review.
This adds support for the remaining variants of the `VectorTableLookup()` intrinsics that take a tuple for the `table` parameter. The tuple can have 2, 3 or 4 `Vector128<byte>` values. These APIs require consecutive registers to be allocated for the 1st operand value. This PR adds support for that as well.

Details
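As background, the lookup these intrinsics perform can be modeled in scalar code. The following is a minimal sketch, not code from the PR: `Vec128`, `tableLookup` and `tableLookupExtension` are hypothetical names, and the model assumes the Arm64 `TBL`/`TBX` behavior, where the table registers are treated as one contiguous byte array, an out-of-range index yields 0 for `TBL`, and leaves the destination byte unchanged for `TBX`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Vector128<byte>.
using Vec128 = std::array<std::uint8_t, 16>;

// Scalar model of a TBL-style lookup: the 1 to 4 table vectors are
// treated as one contiguous byte table; out-of-range indices give 0.
Vec128 tableLookup(const std::vector<Vec128>& table, const Vec128& indices)
{
    assert(!table.empty() && table.size() <= 4);
    const std::size_t tableBytes = table.size() * 16;
    Vec128 result{};
    for (std::size_t i = 0; i < 16; ++i) {
        const std::size_t idx = indices[i];
        result[i] = (idx < tableBytes) ? table[idx / 16][idx % 16] : 0;
    }
    return result;
}

// Scalar model of a TBX-style lookup: same as above, except that
// out-of-range indices keep the corresponding byte of defaultValues.
Vec128 tableLookupExtension(const Vec128& defaultValues,
                            const std::vector<Vec128>& table,
                            const Vec128& indices)
{
    assert(!table.empty() && table.size() <= 4);
    const std::size_t tableBytes = table.size() * 16;
    Vec128 result = defaultValues;
    for (std::size_t i = 0; i < 16; ++i) {
        const std::size_t idx = indices[i];
        if (idx < tableBytes) {
            result[i] = table[idx / 16][idx % 16];
        }
    }
    return result;
}
```

Because the hardware instruction addresses the table as a run of adjacent vector registers, the tuple elements have to end up in consecutive registers, which is the allocation requirement this PR addresses.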
During importer, depending on the form of the 1st argument to `VectorTableLookup()`, create a `FIELD_LIST` node with as many fields as there are entries in the `ValueTuple` of that argument. This node is further decomposed into local vars.

In register allocation, during `buildInterval()`, we see this intrinsic and build as many `refPositions` as there are fields in the `ValueTuple`. We mark each of them with a flag `needsConsecutive`, and for the first `refPosition` we also set the number of consecutive registers the entire series needs (again, this is the same as the number of fields of the `ValueTuple` in the 1st argument). We also save the series in a newly added `nextConsecutiveRefPositionMap` map so that we can easily go to the next refposition in the series. (Note that `RefPosition` doesn't have a `next` pointer and adding one would have a memory cost, hence I chose to add entries in a map.)

During register assignment, once the "first" `refPosition` gets a register, it sets the `registerAssignment` (prior to the allocation pass, `registerAssignment` captures the valid registers for a `RefPosition`) of each subsequent `RefPosition` of the series to the corresponding consecutive register that comes after. Register assignment honors that decision (like it does today) and assigns those registers to the subsequent refpositions of that series.

Example:
TODO:
- Get the `ValueTuple<>` APIs approved. Do not expose the API through ref.
- Handle the case where `Rn` was assigned and the next refposition should get `R0`.

Contributes to #1277, #81599
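The build and assignment steps described under Details can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the JIT's actual code: `needsConsecutive`, `registerAssignment` and `nextConsecutiveRefPositionMap` are names from the description, while `RegMask`, `regCount`, `Allocator`, `buildConsecutiveSeries` and `assignFirst` are hypothetical, and ref positions are identified by vector index rather than by pointer. The sketch assumes 32 vector registers and simply rejects a first register that would make the series run past the last one, so it does not model the `Rn` to `R0` case listed in the TODO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical register mask: bit i set means register Vi is allowed.
using RegMask = std::uint32_t;

struct RefPosition {
    RegMask  registerAssignment = 0xFFFFFFFFu; // valid registers before allocation
    bool     needsConsecutive   = false;
    unsigned regCount           = 0;           // set only on the first of a series
};

struct Allocator {
    std::vector<RefPosition> refPositions;
    // Maps a ref position (by index) to the next one in its series,
    // instead of storing a `next` pointer in every RefPosition.
    std::unordered_map<std::size_t, std::size_t> nextConsecutiveRefPositionMap;

    // Build step: one ref position per tuple field, all flagged, with the
    // series length recorded on the first one. Returns the first index.
    std::size_t buildConsecutiveSeries(unsigned fieldCount)
    {
        assert(fieldCount >= 1 && fieldCount <= 4);
        const std::size_t first = refPositions.size();
        for (unsigned i = 0; i < fieldCount; ++i) {
            RefPosition rp;
            rp.needsConsecutive = true;
            if (i == 0) {
                rp.regCount = fieldCount;
            } else {
                nextConsecutiveRefPositionMap[first + i - 1] = first + i;
            }
            refPositions.push_back(rp);
        }
        return first;
    }

    // Assignment step: once the first ref position gets `firstReg`, pin each
    // subsequent ref position of the series to the next register up.
    // Returns false if the series would run past register 31 (simplification).
    bool assignFirst(std::size_t first, unsigned firstReg)
    {
        const unsigned count = refPositions[first].regCount;
        if (count == 0 || firstReg + count > 32) {
            return false;
        }
        std::size_t cur = first;
        unsigned reg = firstReg;
        refPositions[cur].registerAssignment = RegMask{1} << reg;
        auto it = nextConsecutiveRefPositionMap.find(cur);
        while (it != nextConsecutiveRefPositionMap.end()) {
            cur = it->second;
            ++reg;
            refPositions[cur].registerAssignment = RegMask{1} << reg;
            it = nextConsecutiveRefPositionMap.find(cur);
        }
        return true;
    }
};
```

For a three-field tuple whose first ref position receives register 5, the second and third ref positions end up restricted to registers 6 and 7. The side map keeps `RefPosition` itself unchanged in size, which is the trade-off the description gives for not adding a `next` pointer.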