Arm64/SVE: Add support to handle predicate registers as callee-trash #104065
@@ -75,8 +75,15 @@
 #define RBM_FLT_CALLEE_SAVED (RBM_V8|RBM_V9|RBM_V10|RBM_V11|RBM_V12|RBM_V13|RBM_V14|RBM_V15)
 #define RBM_FLT_CALLEE_TRASH (RBM_V0|RBM_V1|RBM_V2|RBM_V3|RBM_V4|RBM_V5|RBM_V6|RBM_V7|RBM_V16|RBM_V17|RBM_V18|RBM_V19|RBM_V20|RBM_V21|RBM_V22|RBM_V23|RBM_V24|RBM_V25|RBM_V26|RBM_V27|RBM_V28|RBM_V29|RBM_V30|RBM_V31)

+#define RBM_LOWMASK (RBM_P0|RBM_P1|RBM_P2|RBM_P3|RBM_P4|RBM_P5|RBM_P6|RBM_P7)
+#define RBM_HIGHMASK (RBM_P8|RBM_P9|RBM_P10| RBM_P11|RBM_P12|RBM_P13|RBM_P14|RBM_P15)
+#define RBM_ALLMASK (RBM_LOWMASK|RBM_HIGHMASK)
+
+#define RBM_MSK_CALLEE_SAVED (0)
+#define RBM_MSK_CALLEE_TRASH RBM_ALLMASK
Somewhere I should just zero it out if we are not running on an SVE machine.

Is the TP cost coming from the additional killed registers? I assume that's because we don't have a predicate registers equivalent of
I wonder if you could just add a case for predicate registers here: runtime/src/coreclr/jit/lsrabuild.cpp Line 3094 in 55f2bc6
And then during allocation, mask out the predicate registers when processing kills if no predicate registers were used. We would still be creating additional RegRecords though, but maybe this helps a bit.

Actually I guess we were creating those

I am already doing it in https://github.com/dotnet/runtime/pull/104065/files#diff-ad66a6bcf1fd550d5ad10d995c03218afbbc39463d36e1f2a224f9ca070a2f99R858-R860. Predicate registers exist only in the presence of floating point usage. Yes, we do the newly added extra predicate registers in

Although note that when we altjit, we say that "sve capability enable", so we will see predicate registers and will process them during kills. The TP information will be misleading for those cases, but I will add this anyway so that on a non-SVE arm64 machine, we do not process them.

I think the use of predicate registers is going to be much more rare than using float registers, hence adding this extra check would help regardless.
I don't see a good reason to try optimizing for non-SVE machines. In the future we would expect most arm64 machines to be SVE enabled, right? I think we should rather optimize for the common case of "predicate registers not used". It should be possible now that we are only creating one

Yes
Agree. I will do a separate pass for it. #104157 to track it.
+
 #define RBM_CALLEE_SAVED (RBM_INT_CALLEE_SAVED | RBM_FLT_CALLEE_SAVED)
-#define RBM_CALLEE_TRASH (RBM_INT_CALLEE_TRASH | RBM_FLT_CALLEE_TRASH)
+#define RBM_CALLEE_TRASH (RBM_INT_CALLEE_TRASH | RBM_FLT_CALLEE_TRASH | RBM_MSK_CALLEE_TRASH)

 #define REG_DEFAULT_HELPER_CALL_TARGET REG_R12
 #define RBM_DEFAULT_HELPER_CALL_TARGET RBM_R12
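
The suggestion in the review thread on this hunk (mask out the predicate registers when processing kills if no predicate registers were used) can be illustrated with a small standalone sketch. This is not the JIT's code: `RegMask`, the bit layout, `effectiveKillMask`, and `countRegs` are hypothetical names chosen for illustration, and the real allocator uses its own mask type and kill-processing logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit register mask (assumption, not the JIT's regMaskTP):
// bits 0-15 stand in for predicate registers P0-P15,
// bits 16-47 stand in for vector registers V0-V31.
using RegMask = std::uint64_t;

// Stand-in for RBM_ALLMASK: all sixteen predicate registers.
constexpr RegMask kAllPredicateRegs = 0xFFFFull;

// Stand-in for RBM_FLT_CALLEE_TRASH: V0-V7 and V16-V31.
constexpr RegMask kFloatCalleeTrash = 0xFFFF00FFull << 16;

// Returns the registers a call is treated as killing. If the method never
// used a predicate register, the predicate bits are dropped so that kill
// processing has fewer registers to visit.
constexpr RegMask effectiveKillMask(RegMask killMask, bool usedPredicateRegs)
{
    return usedPredicateRegs ? killMask : (killMask & ~kAllPredicateRegs);
}

// Counts the set bits in a mask, i.e. how many registers a kill would touch.
inline int countRegs(RegMask mask)
{
    int n = 0;
    for (; mask != 0; mask &= mask - 1)
    {
        ++n; // each iteration clears the lowest set bit
    }
    return n;
}

// Example: a call that trashes the float callee-trash set plus all predicates.
inline void exampleUsage()
{
    const RegMask callKill = kFloatCalleeTrash | kAllPredicateRegs;
    assert(countRegs(effectiveKillMask(callKill, true)) == 40);  // 24 vector + 16 predicate
    assert(countRegs(effectiveKillMask(callKill, false)) == 24); // predicates filtered out
}
```

In this toy layout the filter removes 16 of the 40 killed registers whenever predicates are unused, which is the "common case" the reviewers propose optimizing for; the actual savings in the JIT would depend on how kills are processed there.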
@@ -146,14 +153,6 @@
 #define REG_JUMP_THUNK_PARAM REG_R12
 #define RBM_JUMP_THUNK_PARAM RBM_R12

-#define RBM_LOWMASK (RBM_P0 | RBM_P1 | RBM_P2 | RBM_P3 | RBM_P4 | RBM_P5 | RBM_P6 | RBM_P7)
-#define RBM_HIGHMASK (RBM_P8 | RBM_P9 | RBM_P10 | RBM_P11 | RBM_P12 | RBM_P13 | RBM_P14 | RBM_P15)
-#define RBM_ALLMASK (RBM_LOWMASK | RBM_HIGHMASK)
-
-// TODO-SVE: Fix when adding predicate register allocation
-#define RBM_MSK_CALLEE_SAVED (0)
-#define RBM_MSK_CALLEE_TRASH (0)
-
 // ARM64 write barrier ABI (see vm\arm64\asmhelpers.asm, vm\arm64\asmhelpers.S):
 // CORINFO_HELP_ASSIGN_REF (JIT_WriteBarrier), CORINFO_HELP_CHECKED_ASSIGN_REF (JIT_CheckedWriteBarrier):
 // On entry:

Eventually, I wonder if this code (for SVE vectors) should be refactored to call out to an emit_R_R_I function instead of falling into the non-SVE code below.

I agree.