-
-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for targets without native atomics #597
Comments
this is a tricky one. to my understanding, a plain load instruction followed by a compiler fence (this is a compiler hint that does not produce an instruction) is a proper 'atomic load' on single core systems. you only need the atomic fence instruction if you have multiple cores. in that sense, one can emulate atomic loads and stores on a single-core RISCV32I device. I'm not sure, however, one can do this properly on stable for this particular target because you should use LLVM 'monotonic' ordering in the load operation -- I think this is equivalent to passing if you had static ATOMIC: AtomicU8 = AtomicU8::new(0);
// single-core version of `store(Ordering::Release)`
atomic.store(42, Ordering::Relaxed);
sync::compiler_fence(Ordering::Release); the other problem is that there's no conditional compilation option ( I would personally not try to bend defmt to accommodate a target like this until there's better language support for it. In my mind that would mean
or maybe better: just the |
another option that's maybe even less work: add another RISCV32I target to the compiler but enable the $ rustc +nightly -Z unstable-options --print target-spec-json --target wasm32-unknown-unknown
(..)
"singlethread": true, I think with that you should be able to use LLVM's |
tried this. no luck: it produces a libcall (call to an intrinsic) #![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]
#[lang = "copy"]
pub trait Copy {}
impl Copy for u8 {}
#[lang = "sized"]
pub trait Sized {}
#[no_mangle]
pub fn foo(p: *const u8) -> u8 {
unsafe { atomic_load_acq::<u8>(p) }
}
extern "rust-intrinsic" {
pub fn atomic_load_acq<T: Copy>(src: *const T) -> T;
} $ # '// comments' are not actually in the file
$ cat riscv32i-singlecore-none-elf.json
{
"arch": "riscv32",
"atomic-cas": false,
"cpu": "generic-rv32",
"data-layout": "e-m:e-p:32:32-i64:64-n32-S128",
"eh-frame-header": false,
"emit-debug-gdb-scripts": false,
"executables": true,
"linker": "rust-lld",
"linker-flavor": "ld.lld",
"llvm-target": "riscv32",
"max-atomic-width": 0, // doesn't matter in no_core
"panic-strategy": "abort",
"relocation-model": "static",
"singlethread": true, // <-
"target-pointer-width": "32"
}
$ cargo rustc --release --target riscv32i-singlecore-none-elf.json -- --emit=obj
$ rust-objdump -Cd --triple riscv32 target/riscv32i-singlecore-none-elf/release/deps/foo-*.o
00000000 <foo>:
0: 13 01 01 ff addi sp, sp, -16
4: 23 26 11 00 sw ra, 12(sp)
8: 93 05 20 00 addi a1, zero, 2
c: 97 00 00 00 auipc ra, 0
10: e7 80 00 00 jalr ra # <- call to other function
14: 83 20 c1 00 lw ra, 12(sp)
18: 13 01 01 01 addi sp, sp, 16
1c: 67 80 00 00 ret
$ rust-nm -CSn target/riscv32i-singlecore-none-elf/release/deps/foo-*.o
U __atomic_load_1 # <- the other function
00000000 00000020 T foo compare that to the $ cargo rustc --release --target wasm32-unknown-unknown -- --emit=asm
$ cat target/wasm32-unknown-unknown/release/deps/foo-*.s
foo:
.functype foo (i32) -> (i32)
local.get 0
i32.load8_u 0
end_function I would expect |
this route would be adding a impl AtomicU8 {
// available on the riscv32i-*-*-* target
pub fn load_relaxed(&self) -> u8 {
unsafe { intrinsics::atomic_load_relaxed(self.v.get()) }
}
} then third party libraries can implement the (+) this does not codegen what I expect for the #![crate_type = "lib"]
#![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]
#[lang = "copy"]
pub trait Copy {}
impl Copy for u8 {}
#[lang = "sized"]
pub trait Sized {}
#[no_mangle]
pub fn foo(p: *const u8) -> u8 {
unsafe { atomic_load_relaxed::<u8>(p) }
}
extern "rust-intrinsic" {
pub fn atomic_load_relaxed<T: Copy>(src: *const T) -> T;
} $ cargo rustc --release --target riscv32i-unknown-none-elf -- --emit=obj
$ rust-objdump -Cd --triple riscv32 target/riscv32i-unknown-none-elf/release/deps/foo-*.o
00000000 <foo>:
0: 13 01 01 ff addi sp, sp, -16
4: 23 26 11 00 sw ra, 12(sp)
8: 93 05 00 00 mv a1, zero
c: 97 00 00 00 auipc ra, 0
10: e7 80 00 00 jalr ra
14: 83 20 c1 00 lw ra, 12(sp)
18: 13 01 01 01 addi sp, sp, 16
1c: 67 80 00 00 ret
$ rust-nm -CSn target/riscv32i-unknown-none-elf/release/deps/foo-*.o
U __atomic_load_1
00000000 00000020 T foo I would expect the machine code to be the exact same as the $ cargo rustc --release --target riscv32imac-unknown-none-elf -- --emit=obj
$ rust-objdump -Cd --triple riscv32 target/riscv32imac-unknown-none-elf/release/deps/foo-*.o
00000000 <foo>:
0: 03 05 05 00 lb a0, 0(a0)
4: 82 80 ret |
Thanks for looking into this @japaric. I think that this is something that should be something implemented in LLVM, as it is for thumbv6 targets. However this is a long term goal, as I suspect it would not be trivial to get upstreamed. Short term, we are exploring implementing a illegal instruction trap handler and treating the esp32c3 as |
in principle this should be solvable with an implementation of @MabezDev could you check the critical-section crate and close this if it's this problem is solvable upstream (in critical-section)? |
On the RP2040 we have a hardware spin-lock subsystem, which is what |
Sorry this issue fell off my radar, I forgot to come back and mention I implemented the atomic emulation trap handler: https://github.com/esp-rs/riscv-atomic-emulation-trap. The switch to Thanks for investigating this! |
293: unconditionally depend on atomic-polyfill for riscv32i target r=japaric a=japaric due to a limitation in the llvm backend [1] the `riscv32i-unknown-none-elf` target lacks the `core::sync::atomic` API even though the actual hardware is capable of atomic loads and stores (`fence` instruction). thus, at this point in time, this target needs `atomic-polyfill` for a working SPSC implementation [1]: knurling-rs/defmt#597 (comment) fixes #271 closes #272 closes #273 Co-authored-by: Jorge Aparicio <jorge.aparicio@ferrous-systems.com>
293: unconditionally depend on atomic-polyfill for riscv32i target r=japaric a=japaric due to a limitation in the llvm backend [1] the `riscv32i-unknown-none-elf` target lacks the `core::sync::atomic` API even though the actual hardware is capable of atomic loads and stores (`fence` instruction). thus, at this point in time, this target needs `atomic-polyfill` for a working SPSC implementation [1]: knurling-rs/defmt#597 (comment) fixes #271 closes #272 closes #273 Co-authored-by: Jorge Aparicio <jorge.aparicio@ferrous-systems.com>
293: unconditionally depend on atomic-polyfill for riscv32i target r=japaric a=japaric due to a limitation in the llvm backend [1] the `riscv32i-unknown-none-elf` target lacks the `core::sync::atomic` API even though the actual hardware is capable of atomic loads and stores (`fence` instruction). thus, at this point in time, this target needs `atomic-polyfill` for a working SPSC implementation [1]: knurling-rs/defmt#597 (comment) fixes #271 closes #272 closes #273 Co-authored-by: Jorge Aparicio <jorge.aparicio@ferrous-systems.com>
Hi, "RISCV user who doesn't want to use the trap handler approach" here :P I noticed that |
Good question @TheButlah. As far as I see we still have 4 atomic variables. The
|
We decided to pause this change until we do a coordinated effort to get risc-v support for |
Perhaps I'm overlooking something - my understanding of the ecosystem is quite limited. But why is this issue related to (or more importantly, blocked on) RISC-V support at all?
In my case, the My rationale for wanting to use the Could #702 please be reopened and considered for merge? It seems to be a minor change, has no drawbacks for the currently supported targets, is unrelated to RISC-V, but expands support to targets without atomics. Apologies if I sound gruff, and its entirely possible I've just misunderstood something :) |
@TheButlah said:
Quoting @japaric s answer to the question:
|
Great, thanks for clarifying! I appreciate it :) |
Would two years of LLVM development have changed the playing ground? In particular, LLVM 16 release notes mention:
Having three |
Assuming that neither old compilers nor custom targets (that created based on old info) are used, the underlying problem should already be fixed in both RISC-V (rust-lang/rust#114499) and AVR (rust-lang/rust#114495). As for the MSP430, it has not been fixed on the LLVM side, but the portable-atomic provides what is needed. |
thanks @taiki-e The problem I experienced was discussed over the weekend on https://matrix.to/#/#esp-rs:matrix.org and seems to be related to |
The
esp32c3
is a riscv based cpu, without the atomic extension. Unlike the armthumbv6
target, it does not have atomic load/stores for the following typesu8
,u16
,u32
, which means we cannot usedefmt
(and other rtt based crates) as it stands.I recently landed a PR in
atomic-polyfill
that adds load/store for those types mentioned above, implemented using critical sections. I have a PR inrtt-target
to use that, but it has not yet been accepted.In theory for better performance, we could ditch the critical section and implement our own atomic operation that use fences to ensure atomicity (this is what is done in llvm for
thumbv6
) - however this can cause UB when mixed with lock based CAS atomics (see: here, here, and here ) therefore is not suitable to add toatomic-polyfill
as it stands.I would like the esp32c3 to be able to support the c3, and I am sure there are other riscv, non-a targets too.
What do you think the best idea is going forward?
The text was updated successfully, but these errors were encountered: