[RFC] Provide intrinsics necessary to get atomics working on the thumbv6m-none-eabi target #114

Closed
japaric opened this issue Oct 28, 2016 · 15 comments

Comments

@japaric
Member

japaric commented Oct 28, 2016

The built-in thumbv6m-none-eabi target has max_atomic_width set to 0 because LLVM doesn't know how to lower atomic operations to actual instructions; instead, it lowers them to intrinsics like __sync_fetch_and_add_4. As a result, core doesn't expose the Atomic* structs, so the alloc crate and any other crate that depends on it can't be compiled for this target.

I propose we implement those intrinsics in this crate (libcompiler-rt.a provides these intrinsics on other architectures) and then change the definition of the thumbv6m target (max_atomic_width = 32) to provide atomics in core; that way alloc, collections and other crates would become compilable for this target.

This is the (incomplete) list of intrinsics that would need to be implemented:

  • __sync_fetch_and_add_4
  • __sync_lock_test_and_set_4

Their implementation would likely use locking by temporarily disabling the interrupts.
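
A minimal sketch of that locking idea, written so it also builds and runs on a host machine. Everything named here is an assumption for illustration: `interrupt_free` is a hypothetical helper that on a real Cortex-M0 would save PRIMASK, execute `cpsid i`, and restore PRIMASK afterwards, and the two functions are unmangled stand-ins for the `__sync_*` symbols LLVM would actually call.

```rust
use core::ptr;

/// Hypothetical stand-in for masking interrupts. On real hardware this
/// would save PRIMASK, execute `cpsid i`, run `f`, then restore PRIMASK.
/// Here it only calls `f`, so the sketch can run on a host machine.
fn interrupt_free<R>(f: impl FnOnce() -> R) -> R {
    // (assumed) disable interrupts here
    let result = f();
    // (assumed) restore the previous interrupt state here
    result
}

/// Sketch of the work `__sync_fetch_and_add_4` has to do: add `val` to
/// `*ptr` and return the previous value, with nothing interleaving.
///
/// # Safety
/// `ptr` must be valid, aligned, and only modified through these helpers.
pub unsafe fn sync_fetch_and_add_4(ptr: *mut u32, val: u32) -> u32 {
    interrupt_free(|| unsafe {
        let old = ptr::read_volatile(ptr);
        ptr::write_volatile(ptr, old.wrapping_add(val));
        old
    })
}

/// Sketch of `__sync_lock_test_and_set_4`: store `val` and return the
/// value that was there before.
///
/// # Safety
/// Same requirements as `sync_fetch_and_add_4`.
pub unsafe fn sync_lock_test_and_set_4(ptr: *mut u32, val: u32) -> u32 {
    interrupt_free(|| unsafe {
        let old = ptr::read_volatile(ptr);
        ptr::write_volatile(ptr, val);
        old
    })
}
```

In a real build the functions would presumably be exported as `#[no_mangle] extern "C"` symbols under the `__sync_*` names; the sketch avoids that so it cannot clash with symbols a host toolchain may already provide.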

The alternative is to make the change in the target definition without implementing the intrinsics. This pushes the task of implementing the intrinsics to the downstream users.

cc @alexcrichton @Amanieu @thejpster @whitequark

@japaric
Member Author

japaric commented Oct 29, 2016

> Their implementation would likely use locking by temporarily disabling the interrupts.

There's a problem with this implementation. This would always work with Cortex-M0 devices but Cortex-M0+ processors have this concept of privileged vs unprivileged execution mode. The instruction that disables/enables interrupts doesn't work when the processor is in unprivileged mode. That means that atomics based on enabling/disabling interrupts won't work (won't be actually atomic) if the processor is in unprivileged mode.
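
To make the condition concrete, here is a small sketch of the check an interrupt-masking fallback would have to pass. It assumes the CONTROL register value and the current exception number (from IPSR) have already been read by some other means; the function name and parameters are hypothetical. The only architectural facts used are that bit 0 of CONTROL (nPRIV) selects unprivileged Thread mode, and that Handler mode is always privileged.

```rust
/// Bit 0 of the CONTROL register (nPRIV): set means Thread mode runs
/// unprivileged.
const CONTROL_NPRIV: u32 = 1 << 0;

/// Returns true when masking interrupts would actually take effect.
///
/// `control` is the raw CONTROL register value and `exception_number`
/// is the active exception number (0 means Thread mode). Both are
/// assumed to have been read elsewhere; this function is hypothetical.
pub fn can_mask_interrupts(control: u32, exception_number: u32) -> bool {
    let handler_mode = exception_number != 0;
    let thread_is_privileged = (control & CONTROL_NPRIV) == 0;
    // Handler mode is always privileged; Thread mode depends on nPRIV.
    handler_mode || thread_is_privileged
}
```

When this returns false, a fallback based on disabling interrupts silently stops being atomic, which is the failure mode described above.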

cc @whitequark

@alexcrichton
Member

The libs team decided a while back, when adding the various atomic types, that if the platform didn't support them the standard library wouldn't export them. That is, we always felt pretty uncomfortable falling back to compiler-rt intrinsics to implement critical operations like atomics, which can have serious effects on runtime functionality.

That being said, if you do want the fallbacks because you've verified they work for you, we don't have a great story for that. We definitely need to improve there!

@Amanieu
Member

Amanieu commented Oct 29, 2016

@alexcrichton My understanding of the lib team decision was that falling back to compiler-rt intrinsics was bad because those don't necessarily guarantee lock freedom. This means that they can result in incorrect behavior if mixed with signals for example.

However if compiler-rt can provide atomic operations while still guaranteeing lock freedom then IMO this should be acceptable. One specific case I have in mind is atomic support for pre-ARMv6 Linux, where the kernel provides a __kuser_cmpxchg function which can be called at address 0xffff0fc0.
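
As a rough sketch of how an operation could be layered on such a helper: the function type below follows the kernel's documented contract (return zero if `*ptr` was changed, non-zero otherwise), and the address is the one quoted above. `fetch_add_with` and `mock_cmpxchg` are hypothetical names; the mock exists only so the loop can be exercised on a host, where calling the real address would fault.

```rust
/// Signature of the kernel helper: returns 0 if `*ptr` was equal to
/// `oldval` and has been replaced by `newval`, non-zero otherwise.
pub type KuserCmpxchg =
    unsafe extern "C" fn(oldval: i32, newval: i32, ptr: *mut i32) -> i32;

/// Address of the helper, as quoted in the discussion.
pub const KUSER_CMPXCHG_ADDR: usize = 0xffff_0fc0;

/// Only meaningful on ARM Linux: turn the fixed address into a callable.
#[cfg(all(target_arch = "arm", target_os = "linux"))]
pub fn kuser_cmpxchg() -> KuserCmpxchg {
    unsafe { core::mem::transmute::<usize, KuserCmpxchg>(KUSER_CMPXCHG_ADDR) }
}

/// Fetch-and-add built as a retry loop over a compare-exchange helper.
///
/// # Safety
/// `ptr` must be valid and aligned, and `cmpxchg` must honor the
/// contract described on `KuserCmpxchg`.
pub unsafe fn fetch_add_with(cmpxchg: KuserCmpxchg, ptr: *mut i32, val: i32) -> i32 {
    loop {
        let old = unsafe { ptr.read_volatile() };
        // Retry until the exchange succeeds against the value we read.
        if unsafe { cmpxchg(old, old.wrapping_add(val), ptr) } == 0 {
            return old;
        }
    }
}

/// Host-only mock with the same contract, NOT atomic; for illustration.
pub unsafe extern "C" fn mock_cmpxchg(oldval: i32, newval: i32, ptr: *mut i32) -> i32 {
    unsafe {
        if ptr.read() == oldval {
            ptr.write(newval);
            0
        } else {
            1
        }
    }
}
```

No lock is held across the loop, so a signal handler interrupting it just forces a retry.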

@japaric
Member Author

japaric commented Oct 29, 2016

> where the kernel provides a __kuser_cmpxchg function which can be called at address 0xffff0fc0.

Reference


For other targets that need a lock-based solution, it seems that we could use a lang item. Perhaps add a "user" option to max_atomic_width and have the compiler request a lang item for each intrinsic that LLVM needs. Ideally, the compiler should only ask for the intrinsics that will actually be used in the target programs and not all of them at once.

This certainly needs a proper RFC.

@Amanieu
Member

Amanieu commented Oct 29, 2016

I don't think that lock-based solutions should be provided by the standard library. I have a crate which provides a generic Atomic<T> type which falls back to a spinlock if the target doesn't support the necessary atomic operations for the given object size.
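
For readers unfamiliar with the approach, a spinlock fallback can be sketched roughly as below. This is not the API of the crate mentioned above; `SpinAtomic` and its methods are hypothetical, and the sketch assumes the target offers at least an atomic boolean swap to build the lock from.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical lock-based "atomic" cell for values of any size.
pub struct SpinAtomic<T: Copy> {
    lock: AtomicBool,
    value: UnsafeCell<T>,
}

// Safe to share: every access to `value` happens under the spinlock.
unsafe impl<T: Copy + Send> Sync for SpinAtomic<T> {}

impl<T: Copy> SpinAtomic<T> {
    pub fn new(value: T) -> Self {
        SpinAtomic {
            lock: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    /// Runs `f` on the value while holding the spinlock.
    fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Spin until we are the one who flipped the flag to `true`.
        while self.lock.swap(true, Ordering::Acquire) {
            std::hint::spin_loop();
        }
        let result = f(unsafe { &mut *self.value.get() });
        self.lock.store(false, Ordering::Release);
        result
    }

    pub fn load(&self) -> T {
        self.with_lock(|v| *v)
    }

    pub fn store(&self, new: T) {
        self.with_lock(|v| *v = new)
    }

    pub fn swap(&self, new: T) -> T {
        self.with_lock(|v| std::mem::replace(v, new))
    }
}
```

Because a lock is held during each operation, this kind of type is not lock-free, which is exactly the property that makes it unsafe to use from signal or interrupt handlers.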

@japaric
Member Author

japaric commented Oct 29, 2016

> we could use a lang item

Or this could be a stopgap perma-unstable solution until we come up with a better one. It depends on how much demand there is for this.

An alternative for the actual problem at hand (people want to cross compile alloc / collections for the thumbv6m target) is to add an option to the alloc crate to not depend on any atomics; that would imply replacing the current OOM handling code with something else (I don't know with what, though).

@thejpster

thejpster commented Oct 29, 2016

Yes, what I want to do is compile liballoc for Cortex-M0 (a Qualcomm/NXP/Freescale Kinetis KE06Z).

@Amanieu
Member

Amanieu commented Oct 29, 2016

@japaric It's not just the OOM handling code, liballoc also contains the implementation of Arc.

@japaric
Member Author

japaric commented Oct 29, 2016

@Amanieu yeah, Arc can be cfg-ed away too. People mostly want to use the alloc interface to get Box and Vec working on this thumbv6m target.

@thejpster

I was curious, so I did a quick check with arm-none-eabi-gcc 5.2 and the _Atomic int type.

#include <stdatomic.h>
_Atomic int g_lock = 0;
void test_function(void)
{
    g_lock++;
}

With a Cortex-M4 target, gcc emits ldrex/strex instructions. With a Cortex-M0 target it emits a call to __atomic_compare_exchange_4. The cortex-m0plus and cortex-m0 targets produce identical code in this case. Interestingly if I try arm-linux-gnu-gcc it refuses to compile for M0, M0+ or M1, but does for M3 and M4. Perhaps because it can't turn interrupts off in user code.
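
For comparison, this is roughly the shape of an increment when only a compare-exchange primitive is available, written in Rust against the host's `AtomicI32`. It is an illustration of the retry-loop pattern, not a reconstruction of what gcc emitted; `increment_via_cas` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Post-increment built only from compare-exchange: returns the value
/// the counter held before the increment, like `g_lock++` in C.
pub fn increment_via_cas(counter: &AtomicI32) -> i32 {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        match counter.compare_exchange_weak(
            current,
            current.wrapping_add(1),
            Ordering::SeqCst,
            Ordering::Relaxed,
        ) {
            // The exchange happened: `prev` is the value we replaced.
            Ok(prev) => return prev,
            // Someone else changed it (or a spurious failure): retry.
            Err(actual) => current = actual,
        }
    }
}
```

On a target without native atomics, each compare-exchange in such a loop becomes a library call like the `__atomic_compare_exchange_4` seen above.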

@alexcrichton
Member

@japaric my preference here would be to just add the appropriate cfg to the liballoc crate for now to get it to compile out Arc if atomics don't exist. The standard library wouldn't work yet but we could in theory update it eventually.
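
A sketch of what such gating can look like, using the `target_has_atomic` cfg as spelled in current stable Rust. This is an assumption about the general shape, not the attribute or module layout liballoc actually used; the module and function names are hypothetical.

```rust
/// Always available: heap allocation that needs no atomics.
pub mod owned {
    pub fn make_boxed(v: u32) -> Box<u32> {
        Box::new(v)
    }
}

/// Only compiled when the target has pointer-sized atomics, because
/// `Arc` needs an atomic reference count.
#[cfg(target_has_atomic = "ptr")]
pub mod shared {
    use std::sync::Arc;

    pub fn make_shared(v: u32) -> Arc<u32> {
        Arc::new(v)
    }
}

/// Lets callers see at compile time whether `shared` exists.
pub const HAS_SHARED: bool = cfg!(target_has_atomic = "ptr");
```

On a target with no atomics, the `shared` module simply disappears from the crate while `owned` keeps compiling, which is the effect being asked for with Box and Vec.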

@japaric
Member Author

japaric commented Oct 31, 2016

@alexcrichton Alright. Sent rust-lang/rust#37492 implementing this approach. alloc and collections can be compiled for the thumbv6m with that change.

@alexcrichton
Member

This is quite an old bug at this point and I believe it's been sorted out, so closing.

@aykevl

aykevl commented May 3, 2019

See https://reviews.llvm.org/D61052 for a related discussion.

@axos88

axos88 commented Aug 2, 2019

Digging this one up. I also need Arc on a Cortex-M0 microcontroller, to implement basic functionality above FreeRTOS.
What if we made the Arc implementation generic over AtomicUsize, so that it would default to core::sync::atomic::AtomicUsize on platforms that implement it, but would allow a user to create Arc<T, MyAtomicUsize> if they can provide a way to safely implement the necessary operations?
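
A rough sketch of that idea, under stated assumptions: `RefCount`, `GenericArc` and `MyAtomicUsize` are hypothetical names, none of this is how the standard library's Arc is defined, and the `Cell`-based counter is only a host stand-in for a user-provided implementation (which on a Cortex-M0 might mask interrupts around each update).

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical trait: the operations a reference count needs.
pub trait RefCount {
    fn one() -> Self;
    fn increment(&self);
    /// Removes one owner; returns true if it was the last one.
    fn decrement(&self) -> bool;
}

impl RefCount for AtomicUsize {
    fn one() -> Self {
        AtomicUsize::new(1)
    }
    fn increment(&self) {
        self.fetch_add(1, Ordering::Relaxed);
    }
    fn decrement(&self) -> bool {
        if self.fetch_sub(1, Ordering::Release) == 1 {
            fence(Ordering::Acquire);
            true
        } else {
            false
        }
    }
}

/// Stand-in for a user-supplied counter on a target without atomics.
pub struct MyAtomicUsize(Cell<usize>);

impl RefCount for MyAtomicUsize {
    fn one() -> Self {
        MyAtomicUsize(Cell::new(1))
    }
    fn increment(&self) {
        self.0.set(self.0.get() + 1);
    }
    fn decrement(&self) -> bool {
        let remaining = self.0.get() - 1;
        self.0.set(remaining);
        remaining == 0
    }
}

struct Inner<T, C> {
    count: C,
    value: T,
}

/// Reference-counted pointer whose counter type is a parameter.
pub struct GenericArc<T, C: RefCount = AtomicUsize> {
    ptr: NonNull<Inner<T, C>>,
}

impl<T, C: RefCount> GenericArc<T, C> {
    pub fn new(value: T) -> Self {
        let inner = Box::new(Inner { count: C::one(), value });
        GenericArc { ptr: NonNull::from(Box::leak(inner)) }
    }
}

impl<T, C: RefCount> Clone for GenericArc<T, C> {
    fn clone(&self) -> Self {
        let inner = unsafe { self.ptr.as_ref() };
        inner.count.increment();
        GenericArc { ptr: self.ptr }
    }
}

impl<T, C: RefCount> Deref for GenericArc<T, C> {
    type Target = T;
    fn deref(&self) -> &T {
        let inner = unsafe { self.ptr.as_ref() };
        &inner.value
    }
}

impl<T, C: RefCount> Drop for GenericArc<T, C> {
    fn drop(&mut self) {
        let last = unsafe { self.ptr.as_ref() }.count.decrement();
        if last {
            // Last owner: reclaim the allocation made in `new`.
            drop(unsafe { Box::from_raw(self.ptr.as_ptr()) });
        }
    }
}
```

The sketch deliberately leaves out `Send`/`Sync` impls and weak references; whether a custom counter makes the pointer safe to share across contexts would be the implementer's responsibility.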
