-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support AVR for inline asm! #91224
Support AVR for inline asm! #91224
Conversation
r? @wesleywiser (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Regarding register classes, just make sure you have enough to accurate describe the constraints that would be needed for various instructions. See https://www.nongnu.org/avr-libc/user-manual/inline_asm.html for a mostly complete list of constraint code used by avr-gcc and LLVM. For example, I can see some instructions which only accept a "high" register (but then again I'm not very familiar with AVR assembly).
compiler/rustc_target/src/asm/avr.rs
Outdated
r31: reg = ["r31", "ZH"], | ||
|
||
X: reg_xyz = ["r27r26", "X"], | ||
Y: reg_xyz = ["r29r28", "Y"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a look at lib/Target/AVR/AVRRegisterInfo.cpp
in the LLVM source you will see that Y
is actually reserved for the stack pointer. Also r0
and r1
are reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like for Y
the way to go is adding the #error
pragma here to keep the user's hands off of it.
I think we'd need some way to name r0
and r1
. For instance, to get the result of a MUL
:
let a = 1u8;
let b = 1u8;
let res: u16;
asm!(
"mul {}, {}",
in(reg) a,
in(reg) b,
lateout("r1r0") res,
);
Is there another architecture that I can look to for an example of how to handle this?
(and just to clarify because it confused me for a moment, Y
is reserved for the frame pointer, the stack pointer has a dedicated register SPH:SPL
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd need some way to name r0 and r1.
Or just leave it to the user to reassemble the value from r0
and r1
themselves. In the end it should be optimized the same way by the compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wasn't clear. If r0
and r1
are reserved, and thus probably shouldn't be nameable for inline asm!
, how is the user expected to get the results from them?
Or to put it another way, what should I do with the information that lib/Target/AVR/AVRRegisterInfo.cpp
reserves these registers if not exclude them from use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wasn't clear. If r0 and r1 are reserved, and thus probably shouldn't be nameable for inline asm!, how is the user expected to get the results from them?
Both registers r1
and r0
should be accessible. Any author of some piece Rust code targeting AVR should be able to name these registers in inline ASM.
The zero register should always be accessible even in inline asm so that the dev can pass 0
to an instruction which only supports registers as inputs, not immediates.
See this avr-gcc wiki doc https://gcc.gnu.org/wiki/avr-gcc#Fixed_Registers which shows that both registers should be available in inline asm.
It is a little bit finicky with LLVM instruction reordering, but it's always like that and I don't see a problem with your approach. You won't be able to say, run a piece of C code (i.e. something that executes MUL
) and then use inline ASM to reassemble r1
/r0
because LLVM inline assembler does not parse/inspect into the raw assembly text (only the constraints/list), so LLVM won't know you're referring to r1/r0 and so will not be able to preserve it during transformation. You would be able to use inline ASM to reassemble r1/r0 from the result of an instruction that is also inside the inline asm block, because the internal contents of any inline asm block would not be reordered. In summary, in this paragraph all I am trying to say that it is perfectly acceptable and allowable to use r1
/r0
in inline ASM, but you cannot rely on their values across inline asm block boundaries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like there should be a register class just for r1
/r0
, they shouldn't be in reg
because we wouldn't want to pick either of them for "any old register".
The zero register should always be accessible
Thanks for bringing this up. I'll need to add to the documentation that if a programmer writes to r1
in any way that they must restore it to zero before the end of the asm!
block.
it is perfectly acceptable and allowable to use r1/r0 in inline ASM, but you cannot rely on their values across inline asm block boundaries.
Does that mean that you shouldn't be able to use them as inputs or outputs to the block? For instance, would my code sample up thread be expected to work, or does that count as "across block boundaries"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that r0
and r1
should be usable as operands in asm!
. However the reality is that LLVM marks these as reserved registers, which causes LLVM's register allocator to ignore these and assume they are never touched by inline assembly. If you try to actually use r0
/r1
as operands then things will break in unpredictable ways, which is why we always prevent reserved registers from being used in asm!
.
The only "correct" way to use r0
and r1
in asm!
is to save their value to another register (which should be properly marked as clobbered) and then restoring these values at the end of the assembly snippet. This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am having a hard time understanding exactly what LLVM means by reserved. The only authority I've found is the comment above getReservedRegs
here: https://github.com/llvm/llvm-project/blob/e356027016c6365b3d8924f54c33e2c63d931492/llvm/include/llvm/CodeGen/TargetRegisterInfo.h#L524-L527
/// A reserved register:
/// - is not allocatable
/// - is considered always live
/// - is ignored by liveness tracking
I don't think I know enough to tell if that should be applied to r0
and r1
or not. I'm inclined to trust your judgement there @Amanieu.
This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.
If this is where we're going, I'm happy to submit this change to LLVM. @dylanmckay, since I suspect you'll be reviewing the differential, does that change make sense to you?
In the meantime, I'll make r0
and r1
error out here with a helpful message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unavoidable until the LLVM backend is fixed to no longer treat these registers as reserved.
If this is where we're going, I'm happy to submit this change to LLVM. @dylanmckay, since I suspect you'll be reviewing the differential, does that change make sense to you?
From my point of view, if AVR-GCC compatibility is kept, then the change is definitely good. If not, maybe it's possible but certainly more complicated 👍
r? @Amanieu |
Thanks for your review @Amanieu! I've replied in the places where I have questions. Is there any style guidance on the translation from constraint codes/descriptions to register class names? Here is my proposal for a minimal set of constraint codes to support:
The first two could possibly be named |
ca56851
to
babde3c
Compare
I've pushed up a variety of changes that addresses most (but not all) of the outstanding items. Please take a look when you have availability @Amanieu. Thanks! |
I think I'm not entirely clear on how I should be handling the register pairs. Any pair can be used as an argument to the 16-bit MOVW instruction, and the upper pairs can be an argument to the 16-bit immediate instructions like ADIW. Also, each of the pointer pairs, which have special behavior for the indirect memory operations, can be used as two general-purpose 8-bit registers when passed to any other instruction. The 8-bit and 16-bit operation expect the lower register of the pair to be named, with the upper register implicit. Are the changes I've made here consistent with that usage? |
One point I'll make is that the instruction set is pretty ad-hoc/bolted on parts in ways, with even the registers supported for an "instruction" being target MCU specific. You will see instructions like: Another instruction complicating it is Suffice to say, I think it's best to land a core set like you're doing and add the others later. It's probably worth adding 3 more extra register classes in this PR ( EDIT: I remembered you're intentionally not allowing frame pointer
|
Thanks for the review @dylanmckay! The point is well received that every AVR device is a new beast. For myself, I'm usually working on ones that don't even have a hardware
Are all the use-cases for these handled by the specific-register syntax of let mut byte = MaybeUninit::uninit();
let input = 42u8;
asm!(
"st X, {}",
in("X") byte.as_mut_ptr() as u16,
in(reg) input,
); or should you also be able to say let mut byte = MaybeUninit::uninit();
let input = 42u8;
asm!(
"st {}, {}",
in(x) byte.as_mut_ptr() as u16,
in(reg) input,
); |
I would recommend adding template modifiers for register pairs to print just the high/low register of a pair. Have a look at how it's done for ARM where the |
That's a great idea. I'll go ahead and add those. |
e3ef486
to
1e4e1c3
Compare
Thank you @Amanieu and @dylanmckay for your continued reviews and many thoughtful comments. I have incorporated all of the feedback thus far and rebased this branch. Please take another look when you have an opportunity. |
@@ -322,6 +322,7 @@ impl AsmBuilderMethods<'tcx> for Builder<'a, 'll, 'tcx> { | |||
InlineAsmArch::SpirV => {} | |||
InlineAsmArch::Wasm32 | InlineAsmArch::Wasm64 => {} | |||
InlineAsmArch::Bpf => {} | |||
InlineAsmArch::Avr => {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to look this up in LLVM's internals, but the clobber for the status register is "~{sreg}"
, which should be marked as clobbered when preserves_flags
is not set. Also the documentation in the unstable book should be updated to reflect this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added that.
So many of the other platforms have an empty branch here it was easy to not spend much thought on this clause. But the realization that I missed this made me worried there might be more. I just went through again and checked for other copy-pasta, didn't notice any concerns.
The one other place we have a null implementation that could potentially have code is suggest_modifier
-- that one I did consider if there was a heuristic to use to know which byte you mean. But in the end it seems equally likely you'd want either byte, and there doesn't seem to be a way to suggest two modifiers.
1e4e1c3
to
c6e8ae1
Compare
@bors r+ rollup=never |
📌 Commit c6e8ae1 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (0b6f079): comparison url. Summary: This change led to moderate relevant regressions 😿 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression |
I have no idea what needs to be done to investigate this performance issue. From the linked results I can see there are five benchmark/profile/scenario rows with issues, and four of them are the externs benchmark. But I don't know what to do next. What are these externs benchmarks? Why would inline assembler changes for a platform that presumably isn't exercised by the benchmark make such a significant change? What's the usual range on these results? What does "sufficient written justification" require? Are there any examples or documentation that explain what's needed? |
The performance issue is almost certainly not related to this PR. |
Good to hear. Is there anything actionable for me here, then? |
@couchand no, I don't think there's anything actionable for you here. If anything, the performance team might want to double check what's going on with that benchmark, and/or our noise suppression techniques. |
I think we can ignore this regression. @rustbot label: +perf-regression-triaged |
A first pass at support for the AVR platform in inline
asm!
. Passes the initial compiler tests, have not yet done more complete verification.In particular, the register classes could use a lot more fleshing out, this draft PR so far only includes the most basic.
cc @Amanieu @dylanmckay