Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/rmutex: use atomic utils #16919

Merged
merged 2 commits into from
Feb 4, 2022
Merged

core/rmutex: use atomic utils #16919

merged 2 commits into from
Feb 4, 2022

Conversation

maribu
Copy link
Member

@maribu maribu commented Sep 30, 2021

Contribution description

Replace use of C11 atomics with atomic utils. This fixes

error: address argument to atomic operation must be a pointer to a
       trivially-copyable type ('_Atomic(int) *' invalid)

error when compiling on AVR with LLVM.

Testing procedure

tests/rmutex should still pass

Issues/PRs references

None

@github-actions github-actions bot added the Area: core Area: RIOT kernel. Handle PRs marked with this with care! label Sep 30, 2021
@maribu
Copy link
Member Author

maribu commented Sep 30, 2021

On my Nucleo-F767ZI:

$ make BOARD=nucleo-f767zi -C tests/rmutex flash test

[...]

Help: Press s to start test, r to print it is ready
READY
s
START
main(): This is RIOT! (Version: 2021.10-devel-790-g7313a-core/rmutex)
Recursive Mutex test
Please refer to the README.md for more information

T2 (prio 6, depth 0): trying to lock rmutex now
T3 (prio 4, depth 0): trying to lock rmutex now
T4 (prio 5, depth 0): trying to lock rmutex now
T5 (prio 2, depth 0): trying to lock rmutex now
T6 (prio 4, depth 0): trying to lock rmutex now
main: unlocking recursive mutex
T5 (prio 2, depth 0): locked rmutex now
T5 (prio 2, depth 1): trying to lock rmutex now
T5 (prio 2, depth 1): locked rmutex now
T5 (prio 2, depth 2): trying to lock rmutex now
T5 (prio 2, depth 2): locked rmutex now
T5 (prio 2, depth 3): trying to lock rmutex now
T5 (prio 2, depth 3): locked rmutex now
T5 (prio 2, depth 3): unlocked rmutex
T5 (prio 2, depth 2): unlocked rmutex
T5 (prio 2, depth 1): unlocked rmutex
T5 (prio 2, depth 0): unlocked rmutex
T3 (prio 4, depth 0): locked rmutex now
T3 (prio 4, depth 1): trying to lock rmutex now
T3 (prio 4, depth 1): locked rmutex now
T3 (prio 4, depth 2): trying to lock rmutex now
T3 (prio 4, depth 2): locked rmutex now
T3 (prio 4, depth 2): unlocked rmutex
T3 (prio 4, depth 1): unlocked rmutex
T3 (prio 4, depth 0): unlocked rmutex
T6 (prio 4, depth 0): locked rmutex now
T6 (prio 4, depth 1): trying to lock rmutex now
T6 (prio 4, depth 1): locked rmutex now
T6 (prio 4, depth 2): trying to lock rmutex now
T6 (prio 4, depth 2): locked rmutex now
T6 (prio 4, depth 3): trying to lock rmutex now
T6 (prio 4, depth 3): locked rmutex now
T6 (prio 4, depth 4): trying to lock rmutex now
T6 (prio 4, depth 4): locked rmutex now
T6 (prio 4, depth 4): unlocked rmutex
T6 (prio 4, depth 3): unlocked rmutex
T6 (prio 4, depth 2): unlocked rmutex
T6 (prio 4, depth 1): unlocked rmutex
T6 (prio 4, depth 0): unlocked rmutex
T4 (prio 5, depth 0): locked rmutex now
T4 (prio 5, depth 1): trying to lock rmutex now
T4 (prio 5, depth 1): locked rmutex now
T4 (prio 5, depth 2): trying to lock rmutex now
T4 (prio 5, depth 2): locked rmutex now
T4 (prio 5, depth 2): unlocked rmutex
T4 (prio 5, depth 1): unlocked rmutex
T4 (prio 5, depth 0): unlocked rmutex
T2 (prio 6, depth 0): locked rmutex now
T2 (prio 6, depth 1): trying to lock rmutex now
T2 (prio 6, depth 1): locked rmutex now
T2 (prio 6, depth 2): trying to lock rmutex now
T2 (prio 6, depth 2): locked rmutex now
T2 (prio 6, depth 3): trying to lock rmutex now
T2 (prio 6, depth 3): locked rmutex now
T2 (prio 6, depth 4): trying to lock rmutex now
T2 (prio 6, depth 4): locked rmutex now

$ echo $?
0

@maribu
Copy link
Member Author

maribu commented Sep 30, 2021

And also:

$ make BOARD=samr21-xpro BUILD_IN_DOCKER=1 -C tests/rmutex

master:

   text	   data	    bss	    dec	    hex	filename
  10748	    132	  10056	  20936	   51c8	/data/riotbuild/riotbase/tests/rmutex/bin/samr21-xpro/tests_rmutex.elf

This PR:

   text	   data	    bss	    dec	    hex	filename
  10792	    132	  10056	  20980	   51f4	/data/riotbuild/riotbase/tests/rmutex/bin/samr21-xpro/tests_rmutex.elf

So this increases ROM consumption. The reason is that previously relaxed memory order was used, atomic utils always use sequentially consistent memory accesses. I think this is actually fixing a race condition in the code, as otherwise the compiler could load the owner of the rmutex before checking if it is already locked.

@chrysn chrysn added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Sep 30, 2021
@chrysn
Copy link
Member

chrysn commented Sep 30, 2021

While I'm waiting for the tools I'll need to run the test: Do you know if that test ever passed? Were mutexes ever usable on AVR?

On the more general topic: Are we moving off c11 atomics towards using the helper everywhere? (It'd make my life a lot easier, given that atomics are a pain to transpile right now, but that's off topic here).

How does seqcst vs. relaxed impact the code size by 40 bytes? (Played around with godbolt; "relaxed" is 0 and "seq_cst", is it just that in may code locations?)

@kaspar030
Copy link
Contributor

On the more general topic: Are we moving off c11 atomics towards using the helper everywhere? (It'd make my life a lot easier, given that atomics are a pain to transpile right now, but that's off topic here).

Yeah. Also, is that a good idea, are C11 atomics that broken? Just to not open another NIH issue here. Are there pointers to reasoning?

@kaspar030
Copy link
Contributor

Yeah. Also, is that a good idea, are C11 atomics that broken? Just to not open another NIH issue here. Are there pointers to reasoning?

Don't get me wrong, I'm for this. I think the reasons in #14331 are sufficient. We should write something somewhere stating that in-tree, atomic_utils should be preferred and why.

@@ -24,11 +24,6 @@
#define RMUTEX_H

#include <stdint.h>
#ifdef __cplusplus
#include "c11_atomics_compat.hpp"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cross language issues with C11 atomics are not unique to rust, btw. C++ also has its own atomics, which are inherently incompatible. The c11_atomics_compat.hpp could be dropped if we would move away from C11 atomics in-tree.

But I guess I should try to improve the ROM costs of the atomic utils first. There is one issue with GCC not optimizing unused variables across memory barriers - which LLVM just does if the variable could be (at least in theory) be allocated in registers, which LLVM reasons is not memory and thus not affected by memory barriers. I won't get past this issue, but GCC's optimizer is generally freaking out when it sees memory barriers or volatile and stops optimizing things that are not affected. (E.g. I often saw that pointer arithmetic done for every access, despite the memory the pointer points to was marked as volatile, not the pointer itself.) Making atomic utils consistently performing equally good or better than C11 atomics in every aspect will make them easy to sell.

@chrysn
Copy link
Member

chrysn commented Oct 1, 2021

Are you sure there is much to optimize? I see the same 24 bytes difference, 8 in rmutex_unlock and 16 in _lock, that's not nothing but not the end of the world either, especially if it does indeed fix a race condition as you mention. I don't see the race condition clearly yet, but comparing the the old and the new disassembled code (where dmb ish instructions at 4 bytes each are inserted around the store of the owner), it looks like that's exactly the difference. (And it's probably two because the consensus around atomic utils was that seq-cst is what's best understood by everyone, so no halving this unless atomic-utils change.)

By the way, I still can't build the test (even with #16923 cherry-picked, BOARD=arduino-mega2560 make -C tests/rmutex all BUILD_IN_DOCKER=1 TOOLCHAIN=llvm CONTINUE_ON_EXPECTED_ERRORS=1 fails for me somewhere in the __asm__ code of cpu.h), is there a known good board this works on?

@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Oct 1, 2021
@maribu
Copy link
Member Author

maribu commented Oct 1, 2021

is there a known good board this works on?

No, I PR'ed everything from the effort in #16924 independently that was ready for review already. But you could revert this PR in #16924 and see that compilation starts to fail.

@benpicco benpicco added CI: run tests If set, CI server will run tests on hardware for the labeled PR CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Jan 3, 2022
Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good - anything that speaks against this?

@maribu
Copy link
Member Author

maribu commented Jan 3, 2022

A slight increase in code size, that some micro-optimizations in the atomic utils could improve again. But my work on that has stalled, sadly. Too much areas to work on, too little time ...

@benpicco
Copy link
Contributor

benpicco commented Jan 3, 2022

Looks like there are some problems on ztimer64 too

@maribu
Copy link
Member Author

maribu commented Jan 3, 2022

Ah, yes. It has a separate implementation of ztimer{,64}_rmutex_unlock() that also needs to be adapted. I'll rebase and do so once the kids are in bed.

This will make it easier to change the width of kernel_pid_t if needed
without breaking code.
@github-actions github-actions bot added the Area: tests Area: tests and testing framework label Jan 13, 2022
Replace use of C11 atomics with atomic utils. This fixes

> error: address argument to atomic operation must be a pointer to a
>        trivially-copyable type ('_Atomic(int) *' invalid)

error when compiling on AVR with LLVM.
@maribu maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Jan 14, 2022
@maribu
Copy link
Member Author

maribu commented Jan 15, 2022

A single unrelated test failed. I'm restarting the CI without tests again.

@maribu maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: run tests If set, CI server will run tests on hardware for the labeled PR CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Jan 15, 2022
@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Feb 3, 2022
@benpicco benpicco merged commit 4aa90d3 into RIOT-OS:master Feb 4, 2022
@OlegHahm OlegHahm added this to the Release 2022.04 milestone Apr 25, 2022
@maribu maribu deleted the core/rmutex branch May 27, 2022 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: sys Area: System Area: tests Area: tests and testing framework Area: timers Area: timer subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants