Support for targets without native atomics #597

MabezDev · 2021-09-28T14:14:34Z

The esp32c3 is a riscv based cpu, without the atomic extension. Unlike the arm thumbv6 target, it does not have atomic load/stores for the following types u8, u16, u32, which means we cannot use defmt (and other rtt based crates) as it stands.

I recently landed a PR in atomic-polyfill that adds load/store for those types mentioned above, implemented using critical sections. I have a PR in rtt-target to use that, but it has not yet been accepted.

In theory for better performance, we could ditch the critical section and implement our own atomic operation that use fences to ensure atomicity (this is what is done in llvm for thumbv6) - however this can cause UB when mixed with lock based CAS atomics (see: here, here, and here ) therefore is not suitable to add to atomic-polyfill as it stands.

I would like the esp32c3 to be able to support the c3, and I am sure there are other riscv, non-a targets too.

What do you think the best idea is going forward?

The text was updated successfully, but these errors were encountered:

japaric · 2021-09-29T12:46:53Z

this is a tricky one.

to my understanding, a plain load instruction followed by a compiler fence (this is a compiler hint that does not produce an instruction) is a proper 'atomic load' on single core systems. you only need the atomic fence instruction if you have multiple cores.

in that sense, one can emulate atomic loads and stores on a single-core RISCV32I device. I'm not sure, however, one can do this properly on stable for this particular target because you should use LLVM 'monotonic' ordering in the load operation -- I think this is equivalent to passing Ordering::Relaxed to AtomicU8::load

if you had AtomicU8 + Ordering::Relaxed for this target you could use this code wherever you need the other orderings.

static ATOMIC: AtomicU8 = AtomicU8::new(0);

// single-core version of `store(Ordering::Release)`
atomic.store(42, Ordering::Relaxed);
sync::compiler_fence(Ordering::Release);

the other problem is that there's no conditional compilation option (cfg) one can use to indicate that some code is being compiled for a single-core system thus you cannot combine the above snippet with some other unsafe code (to make a SPSC queue) and call the whole thing safe because it will NOT be safe on multi-core devices and there's no mechanism to make the code not compiler for multi-core devices.

I would personally not try to bend defmt to accommodate a target like this until there's better language support for it. In my mind that would mean

AtomicU8::load(_, Ordering::Relaxed) somehow exists for the target
there's some cfg mechanism to distinguish between single-core and multi-core systems

or maybe better: just the cfg mechanism added to the language and then AtomicU8 can be implemented in core for a single-core only version of the RISCV32I target

japaric · 2021-09-29T12:52:30Z

another option that's maybe even less work:

add another RISCV32I target to the compiler but enable the singlethread feature in its target spec. the wasm32-unknown-unknown target has that feature enabled:

$ rustc +nightly -Z unstable-options --print target-spec-json --target wasm32-unknown-unknown
(..)
  "singlethread": true,

I think with that you should be able to use LLVM's atomic_load_acquire intrinsics, etc. in the libcore implementation. LLVM should NOT emit an atomic fence instruction when singlethread is enabled (iirc)

japaric · 2021-09-29T13:34:55Z

#597 (comment)

tried this. no luck: it produces a libcall (call to an intrinsic)

#![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]

#[lang = "copy"]
pub trait Copy {}

impl Copy for u8 {}

#[lang = "sized"]
pub trait Sized {}

#[no_mangle]
pub fn foo(p: *const u8) -> u8 {
    unsafe { atomic_load_acq::<u8>(p) }
}

extern "rust-intrinsic" {
    pub fn atomic_load_acq<T: Copy>(src: *const T) -> T;
}

$ # '// comments' are not actually in the file
$ cat riscv32i-singlecore-none-elf.json
{
  "arch": "riscv32",
  "atomic-cas": false,
  "cpu": "generic-rv32",
  "data-layout": "e-m:e-p:32:32-i64:64-n32-S128",
  "eh-frame-header": false,
  "emit-debug-gdb-scripts": false,
  "executables": true,
  "linker": "rust-lld",
  "linker-flavor": "ld.lld",
  "llvm-target": "riscv32",
  "max-atomic-width": 0, // doesn't matter in no_core
  "panic-strategy": "abort",
  "relocation-model": "static",
  "singlethread": true, // <-
  "target-pointer-width": "32"
}

$ cargo rustc --release --target riscv32i-singlecore-none-elf.json -- --emit=obj

$ rust-objdump -Cd --triple riscv32 target/riscv32i-singlecore-none-elf/release/deps/foo-*.o
00000000 <foo>:
       0: 13 01 01 ff   addi    sp, sp, -16
       4: 23 26 11 00   sw      ra, 12(sp)
       8: 93 05 20 00   addi    a1, zero, 2
       c: 97 00 00 00   auipc   ra, 0
      10: e7 80 00 00   jalr    ra # <- call to other function
      14: 83 20 c1 00   lw      ra, 12(sp)
      18: 13 01 01 01   addi    sp, sp, 16
      1c: 67 80 00 00   ret

$ rust-nm -CSn  target/riscv32i-singlecore-none-elf/release/deps/foo-*.o
                  U __atomic_load_1 # <- the other function
00000000 00000020 T foo

compare that to the wasm target. no atomic fence instruction

$ cargo rustc --release --target wasm32-unknown-unknown -- --emit=asm

$ cat target/wasm32-unknown-unknown/release/deps/foo-*.s
foo:
        .functype       foo (i32) -> (i32)
        local.get       0
        i32.load8_u     0
        end_function

I would expect singlethread: true + llvm_target: riscv32 to produce something like the wasm code: a load and no atomic fence instruction

japaric · 2021-09-29T13:58:43Z

to my understanding, a plain load instruction followed by a compiler fence (this is a compiler hint that does not produce an instruction) is a proper 'atomic load' on single core systems.

this route would be adding a load_relaxed API to the Atomic* types -- this is RFC material. The implementation would call the atomic_load_relaxed intrinsic (+)

impl AtomicU8 {
    // available on the riscv32i-*-*-* target
    pub fn load_relaxed(&self) -> u8 {
        unsafe { intrinsics::atomic_load_relaxed(self.v.get()) }
    }
}

then third party libraries can implement the load_relaxed + sync::compiler_fence version of load(Acquire) I mentioned above -- again, that implementation would only be sound on single core systems. how one would enforce that: I don't know

(+) this does not codegen what I expect for the riscv32i target

#![crate_type = "lib"]
#![feature(intrinsics)]
#![feature(lang_items)]
#![feature(no_core)]
#![no_core]

#[lang = "copy"]
pub trait Copy {}

impl Copy for u8 {}

#[lang = "sized"]
pub trait Sized {}

#[no_mangle]
pub fn foo(p: *const u8) -> u8 {
    unsafe { atomic_load_relaxed::<u8>(p) }
}

extern "rust-intrinsic" {
    pub fn atomic_load_relaxed<T: Copy>(src: *const T) -> T;
}

$ cargo rustc --release --target riscv32i-unknown-none-elf -- --emit=obj

$ rust-objdump -Cd --triple riscv32 target/riscv32i-unknown-none-elf/release/deps/foo-*.o
00000000 <foo>:
       0: 13 01 01 ff   addi    sp, sp, -16
       4: 23 26 11 00   sw      ra, 12(sp)
       8: 93 05 00 00   mv      a1, zero
       c: 97 00 00 00   auipc   ra, 0
      10: e7 80 00 00   jalr    ra
      14: 83 20 c1 00   lw      ra, 12(sp)
      18: 13 01 01 01   addi    sp, sp, 16
      1c: 67 80 00 00   ret

$ rust-nm -CSn target/riscv32i-unknown-none-elf/release/deps/foo-*.o
                  U __atomic_load_1
00000000 00000020 T foo

I would expect the machine code to be the exact same as the riscv32imac target in this load_relaxed case (no fence instruction)

$ cargo rustc --release --target riscv32imac-unknown-none-elf -- --emit=obj

$ rust-objdump -Cd --triple riscv32 target/riscv32imac-unknown-none-elf/release/deps/foo-*.o
00000000 <foo>:
       0: 03 05 05 00   lb      a0, 0(a0)
       4: 82 80         ret

MabezDev · 2021-10-05T14:31:03Z

Thanks for looking into this @japaric.

I think that this is something that should be something implemented in LLVM, as it is for thumbv6 targets. However this is a long term goal, as I suspect it would not be trivial to get upstreamed.

Short term, we are exploring implementing a illegal instruction trap handler and treating the esp32c3 as imac. This way has an advantage of not 'dirtying' upstream crates like this one. Obviously this will have a performance impact, but hopefully is not too bad.

japaric · 2022-03-24T14:52:53Z

in principle this should be solvable with an implementation of critical-section for the relevant RISC-V target or devices. I'm not familiar with the details but defmt-rtt which requires a critical section now works on the RP2040 which lacks the HW instructions required to implement a CAS loop and it's a multi-core device.

@MabezDev could you check the critical-section crate and close this if it's this problem is solvable upstream (in critical-section)?

jonathanpallant · 2022-03-24T15:02:16Z

On the RP2040 we have a hardware spin-lock subsystem, which is what critical-section uses. IIRC there's two unsafe C FFI functions you have to implement and you can do so any way you like.

MabezDev · 2022-03-24T15:12:23Z

Sorry this issue fell off my radar, I forgot to come back and mention I implemented the atomic emulation trap handler: https://github.com/esp-rs/riscv-atomic-emulation-trap. The switch to critical-section should close this for any RISCV users who don't want to use the trap handler approach.

Thanks for investigating this!

293: unconditionally depend on atomic-polyfill for riscv32i target r=japaric a=japaric due to a limitation in the llvm backend [1] the `riscv32i-unknown-none-elf` target lacks the `core::sync::atomic` API even though the actual hardware is capable of atomic loads and stores (`fence` instruction). thus, at this point in time, this target needs `atomic-polyfill` for a working SPSC implementation [1]: knurling-rs/defmt#597 (comment) fixes #271 closes #272 closes #273 Co-authored-by: Jorge Aparicio <jorge.aparicio@ferrous-systems.com>

TheButlah · 2022-10-04T01:26:39Z

Hi, "RISCV user who doesn't want to use the trap handler approach" here :P
I was using the ESP32-C3 (RISC-V) and need to try implementing my code without the atomic emulation trap handler (due to this unrelated issue).

I noticed that defmt-rtt uses core::sync::atomic still, making it unusable for people without atomic emulation. Wouldn't that mean that this issue should be repoened?

Urhengulas · 2022-10-04T10:27:36Z

Good question @TheButlah.

As far as I see we still have 4 atomic variables. The static TAKEN: AtomicBool in lib.rs and 3 AtomicUsize as part of struct Channel in channel.rs.

In my understanding we should be able to replace all of them with their non-atomic counterpart, because we only access them while holding the critical section. But I'd like a second opinion on that before going forward with that.

Urhengulas · 2022-10-06T14:07:26Z

We decided to pause this change until we do a coordinated effort to get risc-v support for defmt and probe-run. All the relevant issues will be tracked with the tag topic: risc-v.

TheButlah · 2022-10-08T23:16:11Z

We decided to pause this change until we do a coordinated effort to get risc-v support for defmt and probe-run. All the relevant issues will be tracked with the tag topic: risc-v.

Perhaps I'm overlooking something - my understanding of the ecosystem is quite limited. But why is this issue related to (or more importantly, blocked on) RISC-V support at all?

defmt-rtt already works on the riscv32imac-unknown-none-elf target (note the a). This issue is not about RISC-V support but rather supporting targets that don't have native atomics.

In my case, the riscv32imc-unknown-none-elf (note the absence of a) doesn't work on defmt-rtt, but I'm sure there are other targets that also don't support atomics.

My rationale for wanting to use the -imc target instead of the -imac target is because -imc is more performant.

Could #702 please be reopened and considered for merge? It seems to be a minor change, has no drawbacks for the currently supported targets, is unrelated to RISC-V, but expands support to targets without atomics. defmt-rtt is one of the core libraries in my firmware and I would love to be able to use it without being forced into the -imac target

Apologies if I sound gruff, and its entirely possible I've just misunderstood something :)

Urhengulas · 2022-10-17T09:36:53Z

@TheButlah said:

Perhaps I'm overlooking something - my understanding of the ecosystem is quite limited. But why is this issue related to (or more importantly, blocked on) RISC-V support at all?

defmt-rtt already works on the riscv32imac-unknown-none-elf target (note the a). This issue is not about RISC-V support but rather supporting targets that don't have native atomics.

In my case, the riscv32imc-unknown-none-elf (note the absence of a) doesn't work on defmt-rtt, but I'm sure there are other targets that also don't support atomics.

My rationale for wanting to use the -imc target instead of the -imac target is because -imc is more performant.

Could #702 please be reopened and considered for merge? It seems to be a minor change, has no drawbacks for the currently supported targets, is unrelated to RISC-V, but expands support to targets without atomics. defmt-rtt is one of the core libraries in my firmware and I would love to be able to use it without being forced into the -imac target

Apologies if I sound gruff, and its entirely possible I've just misunderstood something :)

Quoting @japaric s answer to the question:

RISCV+IMC, the architecture variant not the rustc compilation target, does support atomic loads and stores; it does not support CAS operations. In terms of atomics support, that RISCV architecture variant is more or less equivalent to ARMv6-M, which defmt-rtt does support. So, defmt-rtt already supports "targets that don't support atomics" because it does not require CAS operations; it only requires atomic loads and stores.

The difference between the riscv32imc-* and the thumbv6m-* rustc compilation targets is that the RISCV one has broken LLVM codegen and that's why it doesn't have the Atomic* API available at all.

TheButlah · 2022-10-17T17:06:05Z

Great, thanks for clarifying! I appreciate it :)

lure23 · 2024-08-01T06:40:34Z

Would two years of LLVM development have changed the playing ground?

In particular, LLVM 16 release notes mention:

A target feature was introduced to force-enable atomics.

Having three esp32c3 devkits, in 2024, this issue is still alive at least for me.

taiki-e · 2024-08-05T18:51:00Z

Assuming that neither old compilers nor custom targets (that created based on old info) are used, the underlying problem should already be fixed in both RISC-V (rust-lang/rust#114499) and AVR (rust-lang/rust#114495). As for the MSP430, it has not been fixed on the LLVM side, but the portable-atomic provides what is needed.

lure23 · 2024-08-05T18:54:48Z

thanks @taiki-e

The problem I experienced was discussed over the weekend on https://matrix.to/#/#esp-rs:matrix.org and seems to be related to probe-rs 0.24.0.

japaric added the status: blocked Blocked on another issue, or on upstream libraries label Sep 29, 2021

MabezDev closed this as completed Mar 24, 2022

japaric mentioned this issue May 12, 2022

unconditionally depend on atomic-polyfill for riscv32i target rust-embedded/heapless#293

Merged

Urhengulas reopened this Oct 4, 2022

Urhengulas mentioned this issue Oct 4, 2022

Non atomic RTT #699

Closed

TheButlah mentioned this issue Oct 4, 2022

IllegalInstruction exception when using embassy SlimeVR/SlimeVR-Rust#37

Closed

Urhengulas mentioned this issue Oct 5, 2022

Atomic polyfill #702

Closed

Urhengulas added status: needs decision The Knurling team needs to make a decision on this topic: risc-v priority: medium Medium priority for the Knurling team labels Oct 6, 2022

TheButlah mentioned this issue Feb 25, 2023

Embassy HAL implementation for Espressif chips embassy-rs/embassy#745

Closed

6 tasks

akauppi mentioned this issue Jul 31, 2024

Avoid using ESP32-C3-DevKitC-02 - for now akauppi/Embassy-playground-2024#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for targets without native atomics #597

Support for targets without native atomics #597

MabezDev commented Sep 28, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

MabezDev commented Oct 5, 2021

japaric commented Mar 24, 2022

jonathanpallant commented Mar 24, 2022 •

edited

Loading

MabezDev commented Mar 24, 2022

TheButlah commented Oct 4, 2022 •

edited

Loading

Urhengulas commented Oct 4, 2022 •

edited

Loading

Urhengulas commented Oct 6, 2022

TheButlah commented Oct 8, 2022 •

edited

Loading

Urhengulas commented Oct 17, 2022

TheButlah commented Oct 17, 2022

lure23 commented Aug 1, 2024

taiki-e commented Aug 5, 2024 •

edited

Loading

lure23 commented Aug 5, 2024

Support for targets without native atomics #597

Support for targets without native atomics #597

Comments

MabezDev commented Sep 28, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

japaric commented Sep 29, 2021

MabezDev commented Oct 5, 2021

japaric commented Mar 24, 2022

jonathanpallant commented Mar 24, 2022 • edited Loading

MabezDev commented Mar 24, 2022

TheButlah commented Oct 4, 2022 • edited Loading

Urhengulas commented Oct 4, 2022 • edited Loading

Urhengulas commented Oct 6, 2022

TheButlah commented Oct 8, 2022 • edited Loading

Urhengulas commented Oct 17, 2022

TheButlah commented Oct 17, 2022

lure23 commented Aug 1, 2024

taiki-e commented Aug 5, 2024 • edited Loading

lure23 commented Aug 5, 2024

jonathanpallant commented Mar 24, 2022 •

edited

Loading

TheButlah commented Oct 4, 2022 •

edited

Loading

Urhengulas commented Oct 4, 2022 •

edited

Loading

TheButlah commented Oct 8, 2022 •

edited

Loading

taiki-e commented Aug 5, 2024 •

edited

Loading