Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riscv-rt: compatibility with RV32E and RV64E #243

Merged
merged 8 commits into from
Dec 12, 2024
Merged

Conversation

romancardenas
Copy link
Contributor

Avoids using x16+ registers in assembly code. This is necessary to provide compatibility with RV32E.
Note that there is still work to do. Namely, the trap frame must adapt to the target base instruction set.

Solves #189

Related: #239

@romancardenas romancardenas requested a review from a team as a code owner November 9, 2024 09:19
@romancardenas romancardenas changed the title Compatibility with RV32E riscv-rt: partial compatibility with RV32E Nov 9, 2024
Comment on lines 573 to 604
/// `x1`: return address, stores the address to return to after a function call or interrupt.
pub ra: usize,
/// `x5`: temporary register `t0`, used for intermediate values.
pub t0: usize,
/// `x6`: temporary register `t1`, used for intermediate values.
pub t1: usize,
/// `x7`: temporary register `t2`, used for intermediate values.
pub t2: usize,
/// `x28`: temporary register `t3`, used for intermediate values.
pub t3: usize,
/// `x29`: temporary register `t4`, used for intermediate values.
pub t4: usize,
/// `x30`: temporary register `t5`, used for intermediate values.
pub t5: usize,
/// `x31`: temporary register `t6`, used for intermediate values.
pub t6: usize,
/// `x10`: argument register `a0`. Used to pass the first argument to a function.
pub a0: usize,
/// `x11`: argument register `a1`. Used to pass the second argument to a function.
pub a1: usize,
/// `x12`: argument register `a2`. Used to pass the third argument to a function.
pub a2: usize,
/// `x13`: argument register `a3`. Used to pass the fourth argument to a function.
pub a3: usize,
/// `x14`: argument register `a4`. Used to pass the fifth argument to a function.
pub a4: usize,
/// `x15`: argument register `a5`. Used to pass the sixth argument to a function.
pub a5: usize,
/// `x16`: argument register `a6`. Used to pass the seventh argument to a function.
pub a6: usize,
/// `x17`: argument register `a7`. Used to pass the eighth argument to a function.
pub a7: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a good idea to order these fields by there x* name (similar to the spec)?

Also, maybe we use coniditional compilation to not include the x16-x31 registers for RV32E builds?

Mostly thinking about downstream users that may use a pointer to the TrapFrame, which they may or may not try to map to memory layout.

It may not be necessary, but could be a helpful reminder that those registers don't exist on RV32E platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd try to avoid changing the order of the attributes in case users use C or assembly code that relies on the current order.

I'm working on feature-gating registers for RVE32. It's a bit tricky, as we need to add "trash" padding to ensure that the stack is 16-byte aligned. I'll open a PR shortly with all this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd try to avoid changing the order of the attributes in case users use C or assembly code that relies on the current order.

I guess that's also part of what I'm asking, "should we intentionally break downstream users compiling RV32E?" Especially regarding not providing them in-memory registers that don't exist on the platform.

For non-RV32E users, I agree with you, we should avoid breaking them without a good reason.

I'm working on feature-gating registers for RVE32. It's a bit tricky, as we need to add "trash" padding to ensure that the stack is 16-byte aligned. I'll open a PR shortly with all this.

Sounds good, I'll keep an eye out for it.

Maybe for the missing registers on RV32E targets:

pub struct TrapFrame {
    #[cfg(not(rv32e))]
    pub a6: usize,
    #[cfg(rv32e)]
    _a6: usize,
    ...
}

Copy link

@hegza hegza Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on feature-gating registers for RVE32. It's a bit tricky, as we need to add "trash" padding to ensure that the stack is 16-byte aligned. I'll open a PR shortly with all this.

Wait, 16-byte aligned? You mean for RV32E? Stack on RV32E is specified to be 32-bit / 4-byte aligned. It's a specific feature of RV32E and in contrast to RV32I. References in LLVM & Clang:

@romancardenas
Copy link
Contributor Author

I've been working on a complete riscv-target-parser crate to help us in the build scripts. Also, I think it is time to stop using the riscv, riscv32, and riscv64 cfg flags and just use target_arch.

My main challenge now is configuring riscv-rt-macros according to flags triggered in riscv-rt. I don't know how to do this without features, but I will read elsewhere and see if there is any other approach. I don't love letting users activate a riscve feature regardless of the target.

@rmsyn
Copy link
Contributor

rmsyn commented Nov 19, 2024

My main challenge now is configuring riscv-rt-macros according to flags triggered in riscv-rt. I don't know how to do this without features

If this specifically about RV32E stuff, maybe a combination of target_arch and target_feature flags in a #[cfg(...)]? Although, it looks like there are some restrictions on using target_feature on riscv* targets.

I would be happy to help out, if you have some work-in-progress code you would like extra eyes on.

@romancardenas
Copy link
Contributor Author

This is the best way I could think of. I tried forwarding environment variables from build scripts to no avail, as procedural macros are compiled separately from regular crates. The macro code can be squeezed a bit, but more or less this is it.

rmsyn
rmsyn previously approved these changes Nov 22, 2024
Copy link
Contributor

@rmsyn rmsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the best way I could think of. I tried forwarding environment variables from build scripts to no avail, as procedural macros are compiled separately from regular crates. The macro code can be squeezed a bit, but more or less this is it.

Yeah, I tried using conditional compilation with target_feature = "i" + target_feature = "e", but it doesn't look like base ISA are available.

If target_feature = "i" and target_feature = "e" were available, we could use conditional compilation to define a core_interrupt_inner the four variants we care about, i.e. riscv32e, riscv32i, riscv64e, riscv64i. Then we could have the single core_interrupt macro function wrapping the unsafe inner functions.

I can look at the rustc/cargo source, and see what it takes to add base ISA as conditional compilation attribute. It may make more sense for the base ISA stuff to go under the instruction_set attribute (currently only has ARM in the docs).

The changes here LGTM.

Edit: something like this is what I was thinking: rv32e-compat-asm...rmsyn:riscv:rv32e-compat-asm

@romancardenas
Copy link
Contributor Author

I tried something like that, but I found out that the procedural macros are compiled for the host, not the target. Thus, the cfg(target_arch = "riscv32") does not work unless your host machine is a RV32

@rmsyn
Copy link
Contributor

rmsyn commented Nov 23, 2024

I tried something like that, but I found out that the procedural macros are compiled for the host, not the target. Thus, the cfg(target_arch = "riscv32") does not work unless your host machine is a RV32

What about replacing them with the cfg! macro, which is evaluated at run-time instead of build-time?

Something like: rmsyn/riscv@rust-embedded:riscv:rv32e-compat-asm...rv32e-compat-asm

@romancardenas
Copy link
Contributor Author

I'm currently with a considerable workload at work. Will come back to this this weekend, probably

@rmsyn
Copy link
Contributor

rmsyn commented Nov 26, 2024

I'm currently with a considerable workload at work. Will come back to this this weekend, probably

No worries, I hope all goes well.

@romancardenas
Copy link
Contributor Author

@rmsyn take a look to my latest commit. Using cfg!(target =...) didn't work, but I managed to do something very similar with environment variables.

Comment on lines 603 to 606
#[cfg(all(target_arch = "riscv32", riscve))]
_reserved0: usize,
#[cfg(all(target_arch = "riscv32", riscve))]
_reserved1: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these reserved fields added for alignment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the trap frame must be 16-byte aligned. Thus, for RV32E, we need a couple of usize reserved fields

rmsyn
rmsyn previously approved these changes Nov 28, 2024
Copy link
Contributor

@rmsyn rmsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, congrats on the env solution!

@romancardenas romancardenas changed the title riscv-rt: partial compatibility with RV32E riscv-rt: compatibility with RV32E and RV64E Nov 28, 2024
@romancardenas romancardenas force-pushed the rv32e-compat-asm branch 2 times, most recently from 3c48d56 to 62b041c Compare November 28, 2024 15:13
@romancardenas
Copy link
Contributor Author

I added a few minor improvements and wrote all the important changes in the CHANGELOG.md file

rmsyn
rmsyn previously approved these changes Dec 3, 2024
@romancardenas
Copy link
Contributor Author

@hegza can you confirm that this version fits your RV32E projects?

@hegza
Copy link

hegza commented Dec 4, 2024

I maintain a short RVE patch-set that works for us, here: master...hegza:riscv:feat/rve

I can smoke test our platform possibly today already.

@romancardenas
Copy link
Contributor Author

Thanks for your review! I fixed the alignment issue. Now, it should be basically identical to your fork.

@hegza
Copy link

hegza commented Dec 4, 2024

Tested RT by swapping it in place of our fork for a minimal smoke test and indeed there was no smoke. So, works on my machine (rv32emc), I guess 😅 Compiles alright and prints over serial without putting the core into crazy states.

Notably, no traps were taken in the test case. It takes slightly more involved setup.

Comment on lines 476 to 481
let trap_size = arch.trap_frame().len();
// ensure we do not break that sp is 16-byte aligned
if (TRAP_SIZE * width) % 16 != 0 {
if (trap_size * width) % 16 != 0 {
return parse::Error::new(Span::call_site(), "Trap frame size must be 16-byte aligned")
.to_compile_error()
.into();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RV32E requires 4-byte trap alignment instead of 16-byte, compare to my implementation:

master...hegza:riscv:feat/rve#diff-dd85e4f259537e87b681ea17211538f2f6280a0b186b8e3fe7d333a6f937a7c2R414-R427

@hegza
Copy link

hegza commented Dec 5, 2024

For the record, if the RT manages to create any invalid stores to registers x16.. on an RV32E Rust backend, it would fail to compile. So, the fact that it compiles is already a good sign. TrapFrame looked alright as well.

@romancardenas
Copy link
Contributor Author

So I guess it is ready to merge, then!

@romancardenas romancardenas requested a review from rmsyn December 10, 2024 10:18
@romancardenas
Copy link
Contributor Author

I just noticed that we were breaking the 16 byte alignment in our runtime. I guess we could add a bunch of feature gates to relax these constraints in RVxE targets, but I'd rather leave this work for a future PR

@rmsyn
Copy link
Contributor

rmsyn commented Dec 12, 2024

I guess we could add a bunch of feature gates to relax these constraints in RVxE targets, but I'd rather leave this work for a future PR

As far as I can tell, there is no difference for the stack pointer alignment in RVxE implementations, the only difference is the register count (based on Ch. 3 of the Unprivileged Spec). Is there somewhere else that specifies a difference in stack pointer alignment?

@romancardenas romancardenas added this pull request to the merge queue Dec 12, 2024
Merged via the queue into master with commit 3cd758b Dec 12, 2024
119 checks passed
@rmsyn
Copy link
Contributor

rmsyn commented Dec 13, 2024

I haven't found this specified either but that's what clang and LLVM are doing, as linked in earlier discussion already:

* https://github.com/llvm/llvm-project/blob/528bcf3a55ca520c31c77ed5fbacf09bff8f39ec/clang/lib/Basic/Targets/RISCV.h#L145-L150

* https://github.com/llvm/llvm-project/blob/5b17f85a7d722439a39f1ac1c554aed7858adab4/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp#L145-L146

* https://github.com/llvm/llvm-project/blob/5b17f85a7d722439a39f1ac1c554aed7858adab4/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp#L30-L36

We also discussed this earlier over at rust-lang/rust: rust-lang/rust#130555 (comment)

Looks like GCC does the same thing: https://github.com/gcc-mirror/gcc/blob/d136fa00f0d5faff8397edcd7e4ebb3445ab21b0/gcc/config/riscv/riscv.h#L202-L205

I wonder where they get this information from? Since they both do it, they must be basing it on something. I'm looking into some soft-core implementations like CVW and PicoRV32, since those look like the closest thing to an actual hardware implementation at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants