ARM64: Optimize pair of "str reg, [fp]" to stp #35134
Labels
arch-arm64
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
kunalspathak added the arch-arm64 and area-CodeGen-coreclr labels on Apr 17, 2020
Dotnet-GitSync-Bot added the untriaged label (New issue has not been triaged by the area owner) on Apr 17, 2020
BruceForstall removed the untriaged label on Apr 20, 2020
AndyJGraham added a commit to AndyJGraham/runtime that referenced this issue on Oct 27, 2022

This change serves to address the following four GitHub tickets:

1. ARM64: Optimize pair of "ldr reg, [fp]" to ldp dotnet#35130
2. ARM64: Optimize pair of "ldr reg, [reg]" to ldp dotnet#35132
3. ARM64: Optimize pair of "str reg, [reg]" to stp dotnet#35133
4. ARM64: Optimize pair of "str reg, [fp]" to stp dotnet#35134

A technique was employed that involved detecting an optimisation opportunity as instruction sequences were being generated. The optimised instruction was then generated on top of the previous instruction, with no second instruction generated. Thus, there were no changes to instruction group size at “emission time” and no changes to jump instructions.
BruceForstall pushed a commit to BruceForstall/runtime that referenced this issue on Jan 12, 2023
kunalspathak pushed a commit that referenced this issue on Jan 27, 2023

…77540)

* Replace successive "ldr" and "str" instructions with "ldp" and "stp"

  This change serves to address the following four GitHub tickets:

  1. ARM64: Optimize pair of "ldr reg, [fp]" to ldp #35130
  2. ARM64: Optimize pair of "ldr reg, [reg]" to ldp #35132
  3. ARM64: Optimize pair of "str reg, [reg]" to stp #35133
  4. ARM64: Optimize pair of "str reg, [fp]" to stp #35134

  A technique was employed that involved detecting an optimisation opportunity as instruction sequences were being generated. The optimised instruction was then generated on top of the previous instruction, with no second instruction generated. Thus, there were no changes to instruction group size at “emission time” and no changes to jump instructions.

* No longer use a temporary buffer to build the optimized instruction.
* Addressed assorted review comments.
* Now optimizes ascending locations and descending locations with consecutive STR and LDR instructions.
* Modification to remove last instructions.
* Ongoing improvements to remove previously-emitted instruction during ldr / str optimization.
* Stopped optimization of consecutive instructions that straddled an instruction group boundary.
* Addressed code change requests in GitHub.
* Various fixes to ldp/stp optimization

  Add code to update IP mappings when an instruction is removed.

* Delete unnecessary and incorrect assert
* Diagnostic change only, to confirm whether a theory is correct or not when chasing an error.
* Revert "Diagnostic change only, to confirm whether a theory is correct or"

  This reverts commit 4b0e51e.

* Do not merge. Temporarily removed calls to "codeGen->genIPmappingUpdateForRemovedInstruction()". Also, corrected minor bug in instruction numbering when removing instructions during optimization.
* Modifications to better update the IP mapping table for a replaced instruction.
* Minor formatting change.
* Check for out of range offsets
* Don't optimise during prolog/epilog
* Fix windows build error
* IGF_HAS_REMOVED_INSTR is ARM64 only
* Add OptimizeLdrStr function
* Fix formatting
* Ensure local variables are tracked
* Don't peephole local variables

Co-authored-by: Bruce Forstall <brucefo@microsoft.com>
Co-authored-by: Alan Hayward <alan.hayward@arm.com>
Co-authored-by: Alan Hayward <a74nh@users.noreply.github.com>
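
To make the technique described in the commit message concrete, here is a minimal Python sketch of folding a store into the previously generated one while instructions are being generated. It is not the JIT's code. `Store`, `StorePair`, and `emit_store` are hypothetical names, only 64-bit stores with an immediate offset are modelled, and the instruction-group, prolog/epilog, and local-variable checks listed in the commit message are left out.

```python
from dataclasses import dataclass

SIZE = 8  # bytes written by one 64-bit str


@dataclass
class Store:
    """Hypothetical model of 'str reg, [base, #offset]'."""
    reg: str
    base: str
    offset: int


@dataclass
class StorePair:
    """Hypothetical model of 'stp reg1, reg2, [base, #offset]'."""
    reg1: str
    reg2: str
    base: str
    offset: int


def stp_offset_ok(offset: int) -> bool:
    # 64-bit stp encodes a signed 7-bit immediate scaled by 8: -512..504.
    return offset % SIZE == 0 and -512 <= offset <= 504


def emit_store(emitted: list, new: Store) -> None:
    """Append `new`, or overwrite the previous str with a single stp."""
    prev = emitted[-1] if emitted else None
    if isinstance(prev, Store) and prev.base == new.base:
        if new.offset == prev.offset + SIZE and stp_offset_ok(prev.offset):
            # Ascending: the previous store holds the lower address.
            emitted[-1] = StorePair(prev.reg, new.reg, new.base, prev.offset)
            return
        if new.offset == prev.offset - SIZE and stp_offset_ok(new.offset):
            # Descending: the new store holds the lower address, so it goes first.
            emitted[-1] = StorePair(new.reg, prev.reg, new.base, new.offset)
            return
    emitted.append(new)
```

Because the pair overwrites the previous entry instead of being appended, the list never grows by a second instruction, which mirrors the "generated on top of the previous instruction" wording in the commit message.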
A pair of "str reg, [fp]" instructions can be combined into stp if the stores are to consecutive memory locations. I counted such str pairs in the framework libraries and found approx. 33K pairs in 16K methods.
Details: str_str_fp_to_stp.txt
category:cq
theme:optimization
skill-level:intermediate
cost:small
impact:medium
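
The issue does not say how the str pairs were counted. As a rough illustration only, the following Python sketch counts adjacent "str xN, [fp, #imm]" lines whose offsets differ by 8 bytes in a disassembly listing. The input format, the regular expression, and the name `count_str_pairs` are all assumptions, and only 64-bit x registers are considered.

```python
import re

# Assumed line shape: "str x0, [fp,#24]" or "str x1, [fp]" (64-bit registers only).
STR_FP = re.compile(r"^\s*str\s+(x\d+|xzr),\s*\[fp(?:,\s*#(-?\d+))?\]\s*$")


def count_str_pairs(lines):
    """Count adjacent fp-relative str instructions that hit neighbouring slots."""
    count = 0
    prev_offset = None  # offset of the preceding unpaired str, if any
    for line in lines:
        m = STR_FP.match(line)
        if m is None:
            prev_offset = None  # any other instruction breaks the run
            continue
        offset = int(m.group(2) or 0)
        if prev_offset is not None and abs(offset - prev_offset) == 8:
            count += 1
            prev_offset = None  # each str belongs to at most one pair
        else:
            prev_offset = offset
    return count
```

For example, `count_str_pairs(["str x0, [fp,#16]", "str x1, [fp,#24]"])` counts one pair, while the same two stores separated by any other instruction count none.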